As part of my work with Maharashtra Educational Society and with Azeem Technologies, I am often asked for advice on how to automate certain processes with as little effort (and cost) as possible.
A few weeks back, the Principal of H. K. College of Pharmacy expressed a desire to automate generation of semester marksheets that are handed out to students. It appears that the procedure for verification of marks by University of Mumbai involves sending a specially formatted spreadsheet that is then returned back to the college (after validation and any necessary corrections) for issuing the students their marksheets. Unfortunately, the next step entails re-typing the marks into a specially formatted Word document (with custom fonts, styles, logos etc.). Since the marks need to be absolutely correct, teachers are asked to verify and cross check the work of the office assistant and this is a laborious process for all involved. The question was, can this second step be automated?
The obvious solution to this problem is to extract the values from the Excel spreadsheet and feed it to the Word document through some program/script. A litte digging revealed that this requires knowledge of VBScript and some of the internals of how Microsoft Office documents are structured. I quickly decided that I was not going to do this since I had no desire to learn a new language just to solve this problem.
I had seen the output of the Word document which was just a series of pages with each student's marks on a single page. So I wondered if it would be possible to somehow create the output (i.e. all the marksheets) as a single PDF file. I decided to give it a go using my knowledge of LaTeX . LaTeX can be used to prepare professionally typeset documents and so appears to be a natural choice for this purpose. I wrote a Java program that extracted the values from a CSV formatted spreadsheet and then created the appropriate .tex file. I ran into a few problems:
- I needed the table column headers to be centred and all other rows to left justified. LaTeX does not allow this by default. A friend (Thanks Santhana) suggested a workaround: use \parbox for the text in the column header.
- The second problem was the font. The Word document used Cambria and LaTeX cannot use that. As a result, text for each student was not fitting on a single page. Using some custom sizes and judicious use of \footnotesize, I was able to get the text to fit on one page
- A peculiar problem that arose was the fact that when the name of a particular course subject was exceptionally long, a few columns of the table would fall off the page. Once again, \parbox came to the rescue.
- The number of subjects in each semester varies. As a result, the spacing between the end of the marks table and the area for the Principal's signature had to be adjusted accordingly using the \vspace command. This was easy.
As I pondered on what it would take to run a java program followed by LaTex, I thought that perhaps there was a simpler solution after all. Why not submit the spreadsheet to a web server that would then parse the CSV (using PHP for the parsing), produce the marksheets and then the office staff just print it? It would be able to use colours, custom fonts, have a GUI front-end and it would do the job perhaps even more quickly. Very appealing to everyone involved including me. So I did exactly that.
A web front end would require some CSS. Since CSS is still unexplored territory for me, I had to spend some time trying to understand what is feasible to do with CSS and what happens when you try to print a web page. Remember that web pages are supposed to adapt to any screen width and can be infinitely long. Not at all similar to the characteristics of a fixed page size book. Luckily, I am not the trailblazer in this area and I found a number of articles of help. They were not enough but they came close. A list apart had great advice on what to skip in printing. Smashing Magazine provided useful instructions on creating a CSS file specially for printing. One question I had in my mind on how to force page breaks was answered on StackOverflow . There was advice on debugging print style sheets on Slideshare, StackOverflow and by Steve Workman. Particle Lab had some useful advice on the tools setup.
With the help of all these articles, I created a custom CSS stylesheet that substituted the values from pixels to cm and em and overwrote default Bootstrap behaviour using !important. After the initial set of changes, I had to to take a printout, take a foot long ruler/scale and measure accurately the spaces, margins etc. Go back and adjust the values. Repeat. After a few cycles, I installed a PDF print driver that would allow the resultant web page to be printed as a PDF and subsequently, once the major values were nailed down, I did not have to take any more printouts. It was looking great. When I went to demo it to the college staff, I got a shock. The print stylesheet was failing for Chrome and IE and only worked on FireFox. That however is a problem to be solved another day. For now, the college will use Firefox to evaluate this automated solution.