Android WebView and Indian Languages

For one of my projects I needed some display functionality that Android does not provide out of the box, data visualization being one prime example. If you search, you will find a number of open source data visualization libraries for Android, such as AChartEngine, but none of them is on par with the best ones on the web (d3.js is one notable example). The ones on the web are mostly written in JavaScript. Writing my own library seemed like too much work for little gain, especially since it would be re-inventing the wheel. So the obvious options were to re-code those JavaScript libraries or to re-use them. JavaScript is something that I never really learned, so I decided to see if I could re-use the libraries as they are. It turned out you can.

I realized that Android apps can contain a View known as WebView. A WebView loads and displays any HTML that the internal Android HTML engine (WebKit) can handle. Two member methods let you pass it an HTML string: loadData and loadDataWithBaseURL. As stated in the documentation, loadData will not allow any code (such as JavaScript) to be loaded other than what is present in the data itself. This effectively makes it impossible to use a JavaScript library stored separately (such as d3.js in the assets folder). The alternative is loadDataWithBaseURL, which takes a base URL against which relative references in the HTML are resolved, and that's what I used. It worked great!
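Here is a minimal sketch of that setup. The class name ChartPage, the file name d3.min.js and the chart div are my own placeholders, not anything Android requires. The string building is plain Java so it can be tried outside an app, and the two WebView calls are shown as comments because they only exist on a device or emulator.

```java
class ChartPage {
    // Base URL that points at the app's assets folder.
    static final String ASSET_BASE_URL = "file:///android_asset/";

    // Assumption: the library was copied into assets under this (hypothetical) name.
    static final String SCRIPT_NAME = "d3.min.js";

    /** Builds an HTML page that references the bundled script by a relative URL. */
    static String buildHtml(String chartScript) {
        return "<!DOCTYPE html>\n"
             + "<html><head>\n"
             + "<script src=\"" + SCRIPT_NAME + "\"></script>\n"
             + "</head><body>\n"
             + "<div id=\"chart\"></div>\n"
             + "<script>" + chartScript + "</script>\n"
             + "</body></html>";
    }

    public static void main(String[] args) {
        String html = buildHtml("/* chart code goes here */");
        System.out.println(html);

        // Inside an Activity, with a WebView named webView, roughly:
        //   webView.getSettings().setJavaScriptEnabled(true);
        //   webView.loadDataWithBaseURL(ASSET_BASE_URL, html, "text/html", "utf-8", null);
    }
}
```

Because the script tag uses a relative src, it is the base URL passed to loadDataWithBaseURL that decides where the file is looked up. JavaScript is off by default in a WebView, hence the setJavaScriptEnabled call.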

A few weeks later I was working on adding support for the user's default language (which, as we know, in India can be any one of many regional languages). Android 2.3.3 supports none of them by default, even though it has Unicode support. As of Android 4.0 it became possible to select Hindi or Bangla, but only those two, as the default language and input. Better than nothing, though. So I went ahead and coded some strings in Hindi to try it out. I ran the app in a 4.4 KitKat emulator. Imagine my surprise when I saw nothing except boxes (or empty spaces in the case of WebView) where the Hindi text should have been. What was going on? A search on the net suggested that plenty of other Indian developers had noticed the same thing on the emulator but were able to see the text on a real phone. So I ran the app on a tablet with Android 4.4 and, sure enough, the text appeared fine. This was strange behaviour. After a little bit of digging I realized that the Android emulator does not offer Hindi or Bangla as an input language, even though it was running Android 4.4 KitKat. It seems the emulator developers have neglected to add support for Indian languages.

So next I added some text in Gujarati. This time I saw nothing on both the emulator and the tablet. I thought perhaps my text had not been saved correctly in UTF-8. But when I opened the file in Notepad++ and in Eclipse, it looked fine. So what was the problem? After another hour or so of reading articles on the net, I came to the realization that although Android supports Unicode, it does not ship fonts for all languages (Gujarati being one of the missing ones). (Interestingly, some Indian phone manufacturers are able to render regional languages by adding fonts to the base AOSP.) Suggestions on the net to resolve my problem were mostly of the type "Include the particular language's font as an asset in your app" or, more dramatically, "Root your phone, add your font to the system and now all apps will be able to see that language text". Neither of these seemed appealing. The question I had was: how was Windows able to show me text in all the different languages I was trying? Ignoring the legal aspects, could I possibly include that font and support every language in one go? So I searched for the font in Windows and found a single font, ARIALUNI.TTF, that supports all the languages that Windows has. Unfortunately, even ignoring the legal aspects, adding that font to my app is not appealing because the font is 22.1 MB in size. My app as of now is only about 2.5 MB. So I have decided to not support languages other than Hindi for now.
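The encoding check I did by eye in Notepad++ and Eclipse can also be done in code, which rules out a badly saved file before suspecting the fonts. This is only a sketch in plain desktop Java; the class name Utf8Check and both helper methods are my own, and the sample text is the word for "Gujarati" written with Unicode escapes so that the source file's own encoding does not matter.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class Utf8Check {
    /** Returns true only if the bytes decode as UTF-8 without any malformed sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Returns true if at least one code point in the text belongs to the given script. */
    static boolean containsScript(String text, Character.UnicodeScript script) {
        return text.codePoints()
                   .anyMatch(cp -> Character.UnicodeScript.of(cp) == script);
    }

    public static void main(String[] args) {
        // "Gujarati" written in Gujarati script.
        String sample = "\u0A97\u0AC1\u0A9C\u0AB0\u0ABE\u0AA4\u0AC0";
        byte[] saved = sample.getBytes(StandardCharsets.UTF_8);

        System.out.println("valid UTF-8: " + isValidUtf8(saved));
        System.out.println("has Gujarati: "
                + containsScript(sample, Character.UnicodeScript.GUJARATI));
    }
}
```

If both checks pass and the text still shows up as boxes or blanks, the data is fine and the device simply has no font with glyphs for that script, which is what I ran into.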

Towards the end of all this, I decided to see how well the app performed on a Gingerbread device. So I changed android:minSdkVersion to 9, recompiled my app and ran it on a Samsung device running 2.3.4. I was shocked when, instead of the rendered page that I had so far been seeing on the tablet, I saw raw HTML code inside the app's WebView on the Gingerbread phone. I rapidly went through the code and searched the net, and I could not find anything wrong. Eventually I backtracked all the way to the start of the coding process and went through the changes line by line. As part of my earlier debugging of why Hindi was not appearing on the emulator, I had followed numerous suggestions from the net. One of them was that the mimeType value for loadDataWithBaseURL should be set to "text/html; charset=utf-8" to support Unicode HTML, because Android 4.0 ignores the character encoding inside the HTML. After a number of experiments I was able to determine that this mimeType was the culprit. It should be just "text/html". I did, however, have to declare the encoding inside the HTML itself with a meta tag: <meta charset="utf-8">. Once I did that, a proper webpage was displayed instead of raw HTML code on both the Gingerbread and the 4.4 devices.
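Put together, the combination that worked for me looks roughly like the sketch below. The class name HindiPage and the buildHtml helper are my own placeholders, and the WebView call is left as a comment since it only runs inside an Android app. The Hindi sample ("namaste") is written with Unicode escapes.

```java
class HindiPage {
    // Plain MIME type, with no "; charset=..." suffix.
    static final String MIME_TYPE = "text/html";

    // Encoding argument passed alongside the MIME type.
    static final String ENCODING = "utf-8";

    /** Wraps the text in a page that declares its own encoding via a meta tag. */
    static String buildHtml(String text) {
        return "<!DOCTYPE html>\n"
             + "<html><head>\n"
             + "<meta charset=\"utf-8\">\n"
             + "</head><body>\n"
             + "<p>" + text + "</p>\n"
             + "</body></html>";
    }

    public static void main(String[] args) {
        // "namaste" in Devanagari.
        String html = buildHtml("\u0928\u092E\u0938\u094D\u0924\u0947");
        System.out.println(html.length() + " characters of HTML built");

        // Inside an Activity, with a WebView named webView, roughly:
        //   webView.loadDataWithBaseURL("file:///android_asset/", html,
        //                               MIME_TYPE, ENCODING, null);
    }
}
```

The point is that the charset lives in two places, the encoding argument and the meta tag, and nowhere in the mimeType string.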

Whew. What a strange series of quirks in trying to support Indian regional languages. It was a long week but all's well that ends well.