Marcelo Arenas
Claudio Gutierrez
Bijan Parsia
Jorge Pérez
Axel Polleres
Andy Seaborne
Now that the data and ontology layers of the Semantic Web stack have achieved considerable stability through standard recommendations such as RDF and OWL, the query layer is the next item to be completed on W3C's agenda. This layer is realized by the SPARQL Protocol and RDF Query Language (SPARQL), currently under development by W3C's Data Access working group (DAWG). Although the SPARQL specification is not yet fully stable - it recently dropped a step back from candidate recommendation to working draft - people are taking up this specification at a tremendous pace, driven by the strong need for a long-awaited standard for querying the Semantic Web and for exploiting the advantages of RDF together with common metadata vocabularies at large scale.
This is just the right moment to reflect on the current state of the language and its applications, which we aim to do in the proposed tutorial. The contributions of this tutorial will follow two complementary streams. On the one hand, we will provide a practical introduction to SPARQL for newcomers, giving examples from various application domains, providing formal underpinnings, and guiding attendees through the jungle of existing implementations, including those which reach beyond the current specification to query more expressive Semantic Web languages than RDF alone. Thus, participants will get a clear sense of the language as it is specified and as it exists in implementations. On the other hand, we will go deeper into the theoretical foundations of SPARQL, presenting recent results on SPARQL's complexity, its formal foundations in terms of database theory, and its exact semantic relation to the other building blocks in the SW stack, namely RDF Schema, OWL and the upcoming rules layer. Finally, we aim to bring these two streams together and will identify the current limitations and challenges around SPARQL, pointing to possible extensions and emerging application fields.
The presenters of this tutorial approach the topic of Semantic Web querying and SPARQL from various complementary viewpoints. Andy Seaborne co-chairs the Data Access working group (DAWG), which is responsible for the development of SPARQL. The group from the Center for Web Research - Chile (Marcelo Arenas, Claudio Gutierrez, and Jorge Pérez), with their long-term experience and excellent record in database technologies for the Web and Semantic Web foundations, is chiefly responsible for recent advances towards a more formal grounding of SPARQL, and produced best-paper-winning results on the semantics and complexity of the language. Bijan Parsia was a long-term member of the DAWG and was also involved in the development of OWL. Axel Polleres has a strong background in deductive databases and rules and is a member of the Rule Interchange Format (RIF) working group, which currently works on the Semantic Web rules layer, the subsequent building block of the SW stack.
After the tutorial, attendees new to SPARQL should be able to formulate queries, understand the differences and overlaps between SPARQL and traditional database query languages like SQL, and have sufficient insight to understand issues in existing SPARQL engines that might affect their applications. The theoretical background given in the second half of the tutorial will provide a deeper understanding of SPARQL's underlying semantics and complexity. We will see how corner cases are handled and examine the limitations of the current language. Moreover, we will provide a detailed picture of SPARQL's position in the space of related Semantic Web standards. Finally, we will give an outlook on emerging research challenges and possible future directions. Compared with previous related tutorials at other conferences, such as WWW2005, we can offer a much more comprehensive view, reporting significant progress on the language specification, especially with respect to the theoretical foundations of SPARQL but also with respect to implementation experience, all of which influenced the selection of covered topics.
The tutorial will be divided into two main parts: a morning part covering primarily the practical side of SPARQL, and an afternoon part going into more depth on the foundational aspects of SPARQL and discussing Semantic Web data access in the bigger context of related standards.
The first part of the tutorial will emphasize providing a solid basis in the semantics and use of the language. This part will be of special interest to a wide range of attendees, from new users wanting to take their first steps in SPARQL to advanced developers wishing to use the full potential of the SPARQL language in Semantic Web applications.
The second part will be centered around the formalization of the language, discussing corner cases and interesting open problems raised by SPARQL and RDF data access, and relating this theory to well-known topics from database theory. Moreover, the relation to OWL and RDFS as well as aspects of practically evaluating SPARQL over ontologies will be covered in detail. Finally, we will discuss novel aspects such as the emerging Semantic Web rules layer, including the use of SPARQL itself as a powerful rules language as well as its relation to other deductive rules languages. Practically useful extensions to the normative specification of the language will also be discussed, with emphasis on the theoretical implications and problems that they raise. The goal of this second part of the tutorial is to give people a theoretical framework for further research in, around and beyond SPARQL and RDF+OWL data access. The intended audience of this part is primarily researchers with some background in RDF, query languages, and complexity theory, although we will provide sufficient detail to convey the most important take-home messages to any attendee with a solid computer science background.
The whole tutorial is conceived as a full-day event which will take beginners from simple examples all the way through recent research and standardization challenges. Attendees with a mainly practical emphasis will receive a complete introduction with useful hints to start using and implementing SPARQL and SPARQL-based applications in the morning session. People with a more theoretical interest in the language who are already familiar with its basics might decide to attend only the afternoon session, which complements and deepens the practical part from the morning session in several ways. Without claiming that these two parts are completely self-contained, we aim to provide a logical entry/exit point for people interested in parallel tutorials, so that with the suggested structure the single parts are also "consumable" independently.
More specifically, the tutorial will be organized in six units, where Units 1-3 mark the morning part and Units 4-6 the afternoon part, as follows.
Hands-On Experience: The sessions focusing on practical aspects (especially Units 1 and 3, but also 5) will have a hands-on part at the end of the respective unit, where participants will solve simple practical examples under the guidance of the presenters. The engines and tools needed will be made available via a common Web interface on a dedicated tutorial webpage (see also Section 3.1).
Breaks: Coffee breaks are planned after Unit 1 and Unit 4, but the length of each unit may be adapted, if necessary, to the fixed schedule prescribed by the ESWC organizers.
Attendees will be able to access a SPARQL engine via an online interface on the tutorial homepage to interactively follow the presented examples. Especially in the first part, we will make use of this interface to directly practice the learned features. As far as possible, we also plan to make the alternative engines and tools presented in Unit 3 accessible on the tutorial homepage. Further, the complete slide sets will be available online in PDF format on this page so that attendees can conveniently follow the presentations on their laptops, which has didactic advantages compared to hard copies only (e.g. color, overlays).
Technical requirements. We need a room with a beamer and a wireless internet connection. We expect participants to practice using their own laptops. We already had positive experiences with a similar setting for a tutorial on "Answer Set Programming for the Semantic Web" at last year's ESWC. There, we provided a backup WiFi access point (which we can provide again) to a dedicated, locally accessible server reserved for tutorial attendees, in order to guarantee independence from network availability.
The tutorial is mainly directed at two categories of attendees:
-Beginners. Attendees with minimal knowledge about Semantic Web data access and querying will especially take advantage of Units 1, 2 and 3 and deepen their understanding in the remaining sessions.
-Expert and intermediate. Researchers with a good general background in database theory, query languages, and complexity theory, possibly seeking new and open research challenges in the context of SPARQL, will benefit most from Units 4, 5, and 6.
Prerequisites. Although no specific knowledge beyond basic RDF is needed as a prerequisite, a certain background in computer science and database theory will allow attendees to better understand and follow the tutorial.
Semantic Web data access and querying are the key enablers for making Tim Berners-Lee's often-cited vision of making "all the data in the world look like one huge database" come true. In this sense, the query layer, which is now close to completion, will play a central part in the Semantic Web's further development and take-up. The large number of paper submissions to ESWC's research track concerned with issues around SPARQL, despite the standard's pre-final state, underlines the importance of this issue for the whole community. As mentioned before, we think this is an ideal moment to reflect on the current state of the specification and its applications, which we aim to do in the proposed tutorial. We are also convinced that the topic will attract both practitioners from industry and scientists, and we carefully chose the topics presented to serve both these audiences. As it is one of the main objectives of the conference to cater for both these groups of possible attendees and to bring them together in a common event, we hope to have presented an attractive proposal for a tutorial meeting precisely this objective.
Marcelo Arenas, Pontificia Universidad Católica de Chile.
Home Page: http://www.ing.puc.cl/~marenas
Short Bio: Prof. Marcelo Arenas received B.Sc. degrees in Mathematics (1997) and Computer Engineering (1998) and an M.Sc. degree in Computer Science (1998) from the Pontificia Universidad Católica de Chile, and a Ph.D. degree in Computer Science (2005) from the University of Toronto, Canada. In 2005, he joined the Computer Science Department at the Pontificia Universidad Católica de Chile as an Assistant Professor. His research interests are in different aspects of database theory, such as the expressive power of query languages, database semantics, integrity constraints, inconsistency handling, database design, XML databases, data exchange and database aspects of the Semantic Web. Marcelo has received an IBM Ph.D. Fellowship (2004), three best paper awards (PODS 2003 in San Diego, California, PODS 2005 in Baltimore, Maryland and ISWC 2006 in Athens, Georgia) and an Honorable Mention Award in 2006 from the ACM Special Interest Group on Management of Data (SIGMOD) for his Ph.D. dissertation, "Design Principles for XML Data."
Teaching experience: Dr. Arenas has experience in teaching several university lectures on the topics of Databases, XML and Logic for Computer Science. Moreover, in January of 2007 he will be giving a tutorial on foundations of RDF and SPARQL at the University of Edinburgh.
Selected Related Publications:
Claudio Gutierrez, Department of Computer Science, Universidad de Chile.
Home page: http://www.dcc.uchile.cl/cgutierr/.
Short Bio: Claudio Gutierrez received degrees in mathematics and mathematical logic from Universidad de Chile and Pontificia Universidad Católica de Chile, and a Ph.D. degree in computer science from Wesleyan University, U.S.A. Currently, he is an associate professor in the Computer Science Department at the Universidad de Chile, and an associate researcher at the Center for Web Research. His research interests lie in the intersection of databases and the Semantic Web. He has received best research paper awards at the European Semantic Web Conference in 2005 and at the International Semantic Web Conference in 2006.
Teaching experience: C. Gutierrez has taught at several universities at the undergraduate and graduate levels, particularly on databases and the Semantic Web.
Selected Related Publications:
Bijan Parsia, Information Management Group, School of Computer Science - University of Manchester, UK
Home page: http://homepages.manchester.ac.uk/~bparsia/
Short Bio: Bijan Parsia is a lecturer (since 2006) in the School of Computer Science at the University of Manchester, UK. He has published over 50 papers in such areas as description logic reasoning, explanation, trust, ontology editing, planning, web service composition, ontology partitioning, and ontology visualization. He has been a member of the WSDL, WS-Architecture, Data Access, and WS-Policy working groups.
Teaching experience: He has experience in teaching several university lectures in Knowledge Representation and the Semantic Web. He co-organized a tutorial entitled "Learning from the Masters: Understanding Ontologies on the Web" at ISWC 2007 and lectured on SPARQL at the 2006 Reasoning Web Summer School.
Selected Related Publications:
Jorge Pérez, Universidad de Talca - Chile.
Home Page: http://ing.utalca.cl/~jperez
Short Bio: Jorge Pérez received a B.Sc. degree in Computer Engineering and an M.Sc. degree in Computer Science from the Pontificia Universidad Católica de Chile. He is currently an Instructor Professor in the Computer Science Department at Universidad de Talca, and a Ph.D. student under the supervision of Prof. Marcelo Arenas. His research interests are primarily in database theory and the application of database technologies to the Web. Jorge received the best research paper award at the 5th International Semantic Web Conference for work on SPARQL formalization from a database perspective.
Teaching experience: Jorge Pérez has experience in teaching several undergraduate courses in the core of Computer Science curricula, such as Discrete Mathematics, Automata Theory, Algorithms and Data Structures, and Databases.
Selected Related Publications:
Axel Polleres, Universidad Rey Juan Carlos, Madrid.
Home page: http://www.polleres.net.
Short Bio: Dr. Axel Polleres obtained his Ph.D. in Computer Science at the Vienna University of Technology in 2003. From 2003 to early 2006 he worked at DERI at the Leopold-Franzens Universitaet Innsbruck in the areas of Semantic Web Services, Ontologies, Rules Languages and Logic Programming. Continuing this research, he currently works at Universidad Rey Juan Carlos, Madrid, under a "Juan de la Cierva" research fellowship. Dr. Polleres has published more than 30 articles in journals and books and as refereed conference and workshop contributions. Ongoing research projects and working groups he is participating in include WSMO, WSML, and the W3C Rule Interchange Format (RIF) WG.
Teaching experience: Dr. Polleres has experience in teaching several university lectures and training courses on the topics of Logic Programming, Artificial Intelligence, Semantic Web and Web Services. Moreover, he co-organized a full-day tutorial on the topic of "Answer Set Programming for the Semantic Web" at last year's ESWC and will be a presenter at this year's Reasoning Web summer school.
Selected Related Publications:
Andy Seaborne, Hewlett-Packard Laboratories.
Home page: http://www.hpl.hp.com/people/afs/.
Short Bio: Dr. Andy Seaborne is a member of the Semantic Web Research Group in Hewlett-Packard Laboratories and is based in Bristol, UK. He has been involved in RDF query languages since 2001, firstly with the development of RDQL for the Jena framework and latterly with the development of SPARQL. He is co-editor of the SPARQL query language specification. In addition, he has built two implementations of SPARQL: one is a reference implementation of SPARQL and the other is a query engine that is based on SQL.
Teaching experience: Dr. Seaborne gave, among others, tutorials on Jena at ISWC 2002, and on SPARQL at WWW2005 and at the 2006 Jena User Conference.
Selected Related Publications:
The translation was initiated by Axel Polleres on 2007-02-01