Formats for Publishing Information on the Web-page of the Parliament

27 August 2015

Sub-chapter 3.1 of the Open Parliament 2015-2016 Action Plan, Posting Documents on the Website of the Parliament of Georgia in an Editable Format, notes that the Parliament of Georgia publishes information on its website in PDF format, which makes it difficult for users to edit the files or otherwise manipulate the text. The Action Plan therefore proposes more reader-friendly, editable formats such as HTML and MS Word.


Currently, PDF (Portable Document Format) is the most popular format for publishing information among governmental institutions and researchers worldwide. However, alongside its benefits, PDF has many shortcomings.


Many websites and institutions provide information in PDF even though it may be less usable and less efficient than other publishing formats, for both publishers and readers.


Among its benefits, PDF is widely recommended for documents that readers intend to download and/or print, because a PDF is an exact replica of the final printed product. However, browsing and searching PDF content via search engines is inefficient, since PDF is not as SEO (Search Engine Optimization) friendly as HTML. First, this makes it hard for users to find the information they need. Second, PDF is not a very practical format for sharing individual pieces of content, such as images, charts or tables, unless the reader wants to share the entire document.


Furthermore, a PDF requires supporting software, such as Adobe Reader, to be opened, and in most cases paid software is needed to edit or annotate the document.


It is notable that once a user downloads a PDF, no further changes or updates will be delivered to that copy of the document. PDF is also hard to reuse, because the format is complicated for an average user to edit.


One of the disadvantages of public institutions publishing documents as PDFs shows up in scanned pictures of signed agreements, memorandums, etc. Such scanned pictures of government documents can be viewed, but search engines cannot search or sort their content. Using a more effective format for disclosing such agreements is essential for allowing citizens to access this information more easily. Publishing such content in HTML is the most efficient way to share it online so that search engines can index it.


Another disadvantage of publishing information in PDF is that it gives the publisher no data on user behavior, except for the number of downloads.


PDF is also rather burdensome for screen reading, since it is not a web page but a fixed-layout document that does not adapt to the reader's screen. It is therefore unpleasant to read and hard to navigate. However, PDF is definitely a very useful format for drafting a newsletter designed for print.


On the other hand, another publishing format, HTML (HyperText Markup Language), has certain features that may come in handy for publishers and readers online.


First of all, HTML is a more widely accessible format than PDF, as it only requires a web browser to display the information. Most importantly, this format enables users to find the content of an HTML document through search engines more efficiently.
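To illustrate why HTML content is easy for software to read, here is a minimal sketch in Python that pulls the visible text out of a page using only the standard library. The sample page, the class name TextExtractor and the function extract_text are hypothetical and serve only as an illustration; real search engines use far more elaborate pipelines.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the visible text of an HTML page, skipping scripts and styles."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the visible text of an HTML string as one space-joined line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical sample page, not an actual document of the Parliament.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Draft Law on Example Matters</title>
    <style>p { color: black; }</style>
  </head>
  <body>
    <h1>Draft Law on Example Matters</h1>
    <p>Article 1. This law regulates <a href="/laws/1">example matters</a>.</p>
  </body>
</html>
"""

text = extract_text(SAMPLE_PAGE)
print(text)
print("regulates" in text)  # a simple keyword lookup over the extracted text
```

The same kind of lookup is not possible on a scanned image of a document without first running character recognition, which is one reason text published as HTML is simpler to index.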


HTML is publisher-friendly, since the use of its content is measurable. For instance, publishers can learn how much time users spent on a certain page and what information they used. This can be a very significant feature, especially for the information that the legislative body and public institutions publish. HTML is the backbone of the web, and the web can help publishers understand their community better and learn which documents, and which particular parts of them, are most relevant to people.


Moreover, HTML easily allows hyperlinks in the document, and its fonts and appearance are defined by the web page creator, as it does not have embedded fonts like a PDF.


In conclusion, PDF is more useful for printing, while HTML is a more search-engine-friendly, convenient, editable, practical and efficient format for using documents on the web. In line with the commitment in sub-chapter 3.1, Posting Documents on the Website of the Parliament of Georgia in an Editable Format, of the Open Parliament Georgia Action Plan 2015-2016, the Parliament of Georgia should reduce the number of documents published in PDF format. This practice will significantly simplify not only access to the published documents but also their reuse and further modification by citizens.
