Friday, October 23, 2020

ListenData: Translating Web Page while Scraping

Suppose you need to translate a web page while scraping data from a website in R or Python. Google Chrome has a built-in option to translate pages written in a foreign language. If you read only English and want to extract data from a website that offers no English version, this article shows how to translate the page as part of the scraping workflow.

What is Selenium?

You may not be familiar with Selenium, so some background is useful. Selenium is an open-source tool for automating web browsers and is very popular in the testing domain. It allows you to write test scripts in several programming languages, and it is available in both R and Python.

Translate Page in Web Scraping in R and Python

In R, the package is named RSelenium; in Python, Selenium is available by installing the selenium package. The following is a list of languages Chrome supports, along with their codes. You need these codes to tell Chrome which language to translate the web page from and which language to translate it to.
Name                    Code
Amharic                 am
Arabic                  ar
Basque                  eu
Bengali                 bn
English (UK)            en-GB
Portuguese (Brazil)     pt-BR
Bulgarian               bg
Catalan                 ca
Cherokee                chr
Croatian                hr
Czech                   cs
Danish                  da
Dutch                   nl
English (US)            en
Estonian                et
Filipino                fil
Finnish                 fi
French                  fr
German                  de
Greek                   el
Gujarati                gu
Hebrew                  iw
Hindi                   hi
Hungarian               hu
Icelandic               is
Indonesian              id
Italian                 it
Japanese                ja
Kannada                 kn
Korean                  ko
Latvian                 lv
Lithuanian              lt
Malay                   ms
Malayalam               ml
Marathi                 mr
Norwegian               no
Polish                  pl
Portuguese (Portugal)   pt-PT
Romanian                ro
Russian                 ru
Serbian                 sr
Chinese (PRC)           zh-CN
Slovak                  sk
Slovenian               sl
Spanish                 es
Swahili                 sw
Swedish                 sv
Tamil                   ta
Telugu                  te
Thai                    th
Chinese (Taiwan)        zh-TW
Turkish                 tr
Urdu                    ur
Ukrainian               uk
Vietnamese              vi
Welsh                   cy
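
As a rough illustration of how a pair of these codes might be handed to Chrome from Python, here is a minimal sketch. It is not taken from the article. The helper names build_translate_prefs and make_translating_driver are hypothetical, and the preference keys "translate_whitelists" and "translate" are an assumption about Chrome's preference names that may differ between Chrome versions. Only a few rows of the language list are copied into the lookup dictionary.

```python
# A few rows of the language list above: name -> Chrome language code.
LANGUAGE_CODES = {
    "English (US)": "en",
    "French": "fr",
    "German": "de",
    "Japanese": "ja",
    "Spanish": "es",
}


def build_translate_prefs(source_code, target_code):
    """Return a Chrome preferences dict asking for source -> target translation.

    Assumption: the keys "translate_whitelists" and "translate" are the
    preference names Chrome reads; they may vary across Chrome versions.
    """
    return {
        "translate_whitelists": {source_code: target_code},
        "translate": {"enabled": "true"},
    }


def make_translating_driver(source_code, target_code):
    """Start Chrome with the preferences applied (needs selenium and Chrome)."""
    # Imported here so the pure-Python helper above works without selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_experimental_option(
        "prefs", build_translate_prefs(source_code, target_code)
    )
    return webdriver.Chrome(options=options)


# Build the preferences for a French page that should be shown in English.
prefs = build_translate_prefs(
    LANGUAGE_CODES["French"], LANGUAGE_CODES["English (US)"]
)
print(prefs)
```

With a working Chrome and driver installed, calling make_translating_driver("fr", "en") and then driver.get(...) on a French page would be the natural next step; whether Chrome actually translates the page depends on the browser version honoring these preferences.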

from Planet Python

