When Bacon Strikes Back

SPARQL is a neat language for querying datasets stored in the RDF format, an example of which is DBpedia, a structured form of Wikipedia. Sending a query to DBpedia from Java code is trivial, but the query text needs care when the values embedded in it contain SPARQL-reserved characters or syntax. I couldn't find an easy SPARQL escaper, so I wrote one myself.

I use SPARQL to query DBpedia from OneMusicAPI. OneMusicAPI consumes data from DBpedia and presents it as results to metadata queries for particular music releases. If such a query contains an album or artist name with SPARQL-reserved characters then an error can occur. Consider the following query:

PREFIX rdf: 
PREFIX dbpedia2: 
SELECT str(?cover) 
WHERE {
	?subject dbpedia2:name ?name .
	?subject dbpedia2:artist ?artist .
	?subject rdf:type  .
	?subject dbpedia2:cover ?cover .
	FILTER (regex(?name, "Dvořák: Symphony No. 9 "From the New World""@en, "i") && ?artist="Kirill Kondrashin"@en)
}

This is malformed because the string passed to regex contains unescaped double-quote marks: "Dvořák: Symphony No. 9 "From the New World"". The second quote mark terminates the string, so the text beginning From is parsed as SPARQL code. This fails:

Virtuoso 37000 Error SP030: SPARQL compiler, line 5: syntax error at 'From' before 'the'
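
To make the failure concrete, here is a minimal sketch of how such a filter might end up malformed when the album name is pasted straight into the query text. The class and method names are hypothetical and only illustrate the problem; they are not taken from OneMusicAPI.

```java
// Hypothetical illustration of the problem; not OneMusicAPI's actual query-building code.
final class NaiveFilterSketch {

	// Builds the regex FILTER clause by concatenating the album name into the query unescaped.
	static String nameFilter(String albumName) {
		return "FILTER (regex(?name, \"" + albumName + "\"@en, \"i\"))";
	}

	public static void main(String[] args) {
		// The quote marks inside the album name end the string literal early,
		// so the words after the second quote mark land outside it.
		System.out.println(nameFilter("Dvořák: Symphony No. 9 \"From the New World\""));
	}
}
```

Any value that comes from user input or from another dataset can contain these characters, which is why the escaping has to happen before the value is concatenated into the query.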

Looking around, I couldn't find a library that performed the escaping for me. So I wrote something simple myself. Feel free to use it in your own projects!

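import java.util.Map;

import com.google.common.collect.ImmutableMap;

public class SparqlUtils {

	// Each SPARQL-reserved character mapped to its escaped form.
	private static final Map<String, String> SPARQL_ESCAPE_SEARCH_REPLACEMENTS = ImmutableMap.<String, String>builder()
		.put("\t", "\\t")
		.put("\n", "\\n")
		.put("\r", "\\r")
		.put("\b", "\\b")
		.put("\f", "\\f")
		.put("\"", "\\\"")
		.put("'", "\\'")
		.put("\\", "\\\\")
		.build();

	/**
	 * Escapes a string for use inside a SPARQL string literal.
	 * See http://www.w3.org/TR/rdf-sparql-query/#grammarEscapes
	 * @param string the raw text, for example an album or artist name
	 * @return the text with each reserved character replaced by its escape sequence
	 */
	public static String escape(String string) {
		StringBuilder bufOutput = new StringBuilder(string);
		for (int i = 0; i < bufOutput.length(); i++) {
			String replacement = SPARQL_ESCAPE_SEARCH_REPLACEMENTS.get(String.valueOf(bufOutput.charAt(i)));
			if (replacement != null) {
				bufOutput.deleteCharAt(i);
				bufOutput.insert(i, replacement);
				// advance past the replacement so it is not escaped again
				i += (replacement.length() - 1);
			}
		}
		return bufOutput.toString();
	}
}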

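This code inspects each character in the input string and looks up a replacement for it. If one exists, the character is swapped for its escape sequence and the loop index skips past the inserted text, so the replacement is not itself escaped again. Once every character has been seen, the result is returned with all replacements applied.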
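
For comparison, the same character-by-character idea can be written by appending to a fresh builder instead of editing one in place, which removes the index bookkeeping. This is only a sketch of an alternative formulation, not the SparqlUtils class above; the class name SparqlEscapeSketch is made up for the example, and a switch stands in for the map so that the block runs without any libraries.

```java
// Sketch only: an append-based variant of the escaper, assuming the same eight escapes.
final class SparqlEscapeSketch {

	// Copies the input into a new builder, writing the escaped form of each reserved character.
	static String escape(String input) {
		StringBuilder out = new StringBuilder(input.length() + 8);
		for (int i = 0; i < input.length(); i++) {
			char c = input.charAt(i);
			switch (c) {
				case '\t': out.append("\\t"); break;
				case '\n': out.append("\\n"); break;
				case '\r': out.append("\\r"); break;
				case '\b': out.append("\\b"); break;
				case '\f': out.append("\\f"); break;
				case '"':  out.append("\\\""); break;
				case '\'': out.append("\\'"); break;
				case '\\': out.append("\\\\"); break;
				default:   out.append(c);
			}
		}
		return out.toString();
	}

	public static void main(String[] args) {
		String album = "Dvořák: Symphony No. 9 \"From the New World\"";
		// Prints the regex call with the inner quote marks preceded by backslashes.
		System.out.println("regex(?name, \"" + escape(album) + "\"@en, \"i\")");
	}
}
```

Whichever version is used, the escaped value is what gets concatenated into the query text, so the quote marks around From the New World stay inside the string literal.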

Note: the SPARQL_ESCAPE_SEARCH_REPLACEMENTS map uses the Guava ImmutableMap class for building its Map. Replace this with standard core Java API calls if you don't already use Guava (but you should do!)
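
If you would rather avoid the Guava dependency, one option is Map.of, which has been part of java.util since Java 9 and returns an immutable map. The sketch below assumes Java 9 or later; the holder class name SparqlEscapeTable is made up for the example.

```java
import java.util.Map;

// Sketch only: a Guava-free way to build the same lookup table, assuming Java 9+.
final class SparqlEscapeTable {

	// Map.of accepts up to ten key/value pairs and rejects duplicate keys.
	static final Map<String, String> SPARQL_ESCAPE_SEARCH_REPLACEMENTS = Map.of(
		"\t", "\\t",
		"\n", "\\n",
		"\r", "\\r",
		"\b", "\\b",
		"\f", "\\f",
		"\"", "\\\"",
		"'", "\\'",
		"\\", "\\\\"
	);
}
```

The iteration order of a Map.of map is unspecified, which does not matter here because the escaper only ever looks up single characters.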

Thanks to kennymatic for the image above.