I am trying to submit a form using mechanize, and the returned HTML should contain Arabic. However, the HTML contains escape sequences like
\xd8\xa7\xd9\x8e\xd9\x84\xd9\x92 in places where there should be Arabic text.
The original code was here
The modified version I am trying is:
    import mechanize

    url = "https://html.duckduckgo.com/html"
    br = mechanize.Browser()
    br.set_handle_robots(False)  # ignore robots
    br.open(url)
    br.select_form(name='x')
    br["q"] = 'Arabic'
    res = br.submit()
    content = res.read()
    content = str(content)
    content = content.replace(r'\n', '\n')
    content = content.replace(r'\r', '')
    content = content.replace(r'\t', '')
    with open("mechanize_results.html", "w") as f:
        f.write(str(content))
The question is: how can I display a proper HTML page with Arabic instead of these strange characters?
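For reference, here is a minimal sketch of what I suspect is going on, without mechanize or a network call. The `raw` bytes are a hypothetical stand-in for what `res.read()` returns, and UTF-8 is an assumption about the page encoding, not something I have confirmed from the response headers. Calling `str()` on a bytes object seems to produce its printable representation, escapes included, while `decode()` turns the bytes into actual text:

```python
# Hypothetical stand-in for the bytes returned by res.read();
# the escape sequence is the one shown above.
raw = b"<p>\xd8\xa7\xd9\x8e\xd9\x84\xd9\x92</p>\r\n"

# str() on bytes gives the repr, e.g. "b'<p>\\xd8\\xa7...'", so the
# escapes end up literally in the output file.
as_repr = str(raw)

# decode() interprets the bytes as text. UTF-8 is assumed here.
text = raw.decode("utf-8")

# Write with an explicit encoding so the Arabic characters survive
# regardless of the platform default; newline="" keeps line endings as-is.
with open("mechanize_results.html", "w", encoding="utf-8", newline="") as f:
    f.write(text)
```

If that is right, the three `replace` calls in my script would no longer be needed, since decoding handles `\n`, `\r` and `\t` as ordinary characters.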