The Extensible Markup Language (XML) has become the de facto standard for information representation and interchange on the Internet. XML parsing is a core operation performed on an XML document so that it can be accessed and manipulated. This operation is known to cause performance bottlenecks in applications and systems that process large volumes of XML data. We believe that parallelism is a natural way to boost performance. Leveraging multicore processors can offer a cost-effective solution, because future multicore processors will support hundreds of cores and will offer a high degree of parallelism in hardware. We propose a data parallel algorithm called ParDOM for XML DOM parsing, which builds an in-memory tree structure for an XML document. ParDOM has two phases. In the first phase, an XML document is partitioned into chunks and parsed in parallel. In the second phase, partial DOM node tree structures created during the first phase are linked together (in parallel) to build a complete DOM node tree. ParDOM offers fine-grained parallelism by adopting a flexible chunking scheme: each chunk can contain an arbitrary number of start and end XML tags that are not necessarily matched. ParDOM can be conveniently implemented using a data parallel programming model that supports map and sort operations. Through empirical evaluation, we show that ParDOM yields better scalability than PXP, a recently proposed parallel DOM parsing algorithm, on commodity multicore processors.
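
The two-phase idea described above can be illustrated with a small Python sketch. This is not the ParDOM implementation: every name in it (`split_chunks`, `parse_chunk`, `link`, `par_dom_sketch`, `Node`) is hypothetical, the linking phase runs sequentially rather than in parallel, and the tokenizer assumes well-formed input with no comments, CDATA sections, or `>` characters inside attribute values. Text nodes and attributes are ignored. It only shows how chunks with unmatched start and end tags can be parsed independently and stitched together afterwards.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field

# Matches start, end, and self-closing element tags (simplifying assumption:
# no comments, CDATA, or '>' inside attribute values).
TAG = re.compile(r"<(/?)([A-Za-z_][\w.:-]*)[^<>]*?(/?)>")


@dataclass
class Node:
    name: str
    children: list = field(default_factory=list)


def split_chunks(xml, n):
    """Cut xml into roughly n chunks; every cut falls just before a '<'."""
    size = max(1, len(xml) // n)
    cuts, pos = [0], size
    while pos < len(xml):
        nxt = xml.find("<", pos)
        if nxt == -1:
            break
        cuts.append(nxt)
        pos = nxt + size
    cuts.append(len(xml))
    return [xml[a:b] for a, b in zip(cuts, cuts[1:])]


def parse_chunk(chunk):
    """Phase 1 (map step): build partial subtrees for a single chunk.

    Returns (events, still_open):
      events     - ("node", Node) for a subtree whose parent lies in an
                   earlier chunk, or ("close", name) for an end tag whose
                   start tag lies in an earlier chunk.
      still_open - elements left unclosed at the end of the chunk,
                   outermost first.
    """
    events, stack = [], []
    for m in TAG.finditer(chunk):
        closing, name, self_closing = m.groups()
        if closing:
            if stack:
                stack.pop()                      # matched inside this chunk
            else:
                events.append(("close", name))   # resolved in phase 2
        else:
            node = Node(name)
            if stack:
                stack[-1].children.append(node)
            else:
                events.append(("node", node))    # parent unknown for now
            if not self_closing:
                stack.append(node)
    return events, stack


def link(partials):
    """Phase 2: stitch the partial results into one tree (sequential here)."""
    root, stack = None, []
    for events, still_open in partials:
        for kind, item in events:
            if kind == "close":
                stack.pop()
            elif stack:
                stack[-1].children.append(item)
            else:
                root = item
        stack.extend(still_open)
    return root


def par_dom_sketch(xml, n_chunks=4):
    chunks = split_chunks(xml, n_chunks)
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(parse_chunk, chunks))
    return link(partials)


def to_tuple(node):
    """Helper for inspecting the resulting tree."""
    return (node.name, [to_tuple(c) for c in node.children])
```

Calling `to_tuple(par_dom_sketch(doc, n_chunks=4))` on a small well-formed document should give the same tree regardless of the chunk count, which is the property that makes the chunking scheme flexible. Threads are used here only to show the map step; in CPython the global interpreter lock means this sketch will not exhibit real multicore speedup.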