Oct 11 2009
Howto: XHR Listening by a Firefox Addon
The following post draws significantly from a post by Jan Odvarko at http://www.softwareishard.com/blog/firebug/nsitraceablechannel-intercept-http-traffic/ but goes a bit further. There are also some sections which were inspired by Firebug, but are heavily modified.
What you need to know
Before I get into the code, the most important thing to understand is that your extension’s listener is just one in a chain. It is the responsibility of every listener in the chain to pass on the information. Failure to do this has some amusing consequences... like nothing loading in the browser.
Just to make it really clear — Don’t drop the ball. (Edit: And while you can edit the data in the stream — don’t do it unless you have a really good reason.)
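To make the "don’t drop the ball" point concrete, here is a minimal sketch in plain JavaScript. It does not touch XPCOM at all; makeSink, PassThroughListener and deliver are names made up for this illustration, and the sink simply stands in for whatever sits at the end of the real chain.

```javascript
// Plain-JavaScript model of a listener chain. Nothing here touches XPCOM;
// makeSink, PassThroughListener and deliver are hypothetical names.

// The last listener in the chain: it stands in for the browser itself.
function makeSink() {
    var chunks = [];
    return {
        result: null,
        onStartRequest: function () { chunks = []; },
        onDataAvailable: function (data) { chunks.push(data); },
        onStopRequest: function () { this.result = chunks.join(""); }
    };
}

// A listener in the middle of the chain. With forward === false it keeps
// its own copy of the data but never passes anything on.
function PassThroughListener(next, forward) {
    this.next = next;
    this.forward = forward;
    this.seen = [];
}
PassThroughListener.prototype = {
    onStartRequest: function () {
        if (this.forward) this.next.onStartRequest();
    },
    onDataAvailable: function (data) {
        this.seen.push(data);
        if (this.forward) this.next.onDataAvailable(data);
    },
    onStopRequest: function () {
        if (this.forward) this.next.onStopRequest();
    }
};

// Feed a response through a listener, one chunk at a time.
function deliver(listener, parts) {
    listener.onStartRequest();
    for (var i = 0; i < parts.length; i++) {
        listener.onDataAvailable(parts[i]);
    }
    listener.onStopRequest();
}

var goodSink = makeSink();
var goodListener = new PassThroughListener(goodSink, true);
deliver(goodListener, ["<p>", "hi", "</p>"]);

var badSink = makeSink();
var badListener = new PassThroughListener(badSink, false);
deliver(badListener, ["<p>", "hi", "</p>"]);

console.log(goodSink.result); // prints <p>hi</p>
console.log(badSink.result);  // prints null: the "browser" never received anything
```

Both listeners saw the full response themselves, but only the one that forwarded every call left the end of the chain with something to render.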
Convenience methods and aliases
A lot of the Firefox internals are accessed using Components.classes and Components.interfaces. While the verbosity makes it clear, it can at times be overly repetitive and, honestly, can take a long time to write out. A fairly common shorthand is in use, Cc and Ci, with a few other less common shorthands like CCIN (for creating instances of a class based on a class name and an interface name) and CCSV (similarly getting a service based on a class name and interface name):
    if (typeof Cc == "undefined") {
        var Cc = Components.classes;
    }
    if (typeof Ci == "undefined") {
        var Ci = Components.interfaces;
    }
    if (typeof CCIN == "undefined") {
        function CCIN(cName, ifaceName) {
            return Cc[cName].createInstance(Ci[ifaceName]);
        }
    }
    if (typeof CCSV == "undefined") {
        function CCSV(cName, ifaceName) {
            if (Cc[cName])
                // if fbs fails to load, the error can be _CC[cName] has no properties
                return Cc[cName].getService(Ci[ifaceName]);
            else
                dumpError("CCSV fails for cName:" + cName);
        }
    }
What’s with all of the typeof checks?
Firebug, gotta love it, but it declares the same things using const. Inside of an if() block, a const is still seen and conflicts, even when the if condition evaluates to false. The code above is essentially a workaround to satisfy both possibilities. If the user has Firebug installed, then carry on; if the user doesn’t have Firebug installed, declare those shorthands.
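As a small aside on why the guard uses typeof rather than testing the name directly, here is a sketch in plain JavaScript. It does not reproduce the Firebug const conflict itself; it only shows that typeof is safe on a name that was never declared. someNameNobodyDeclared and MyShorthand are made-up identifiers for this example.

```javascript
// typeof never throws, even for a name that was never declared.
var bareReferenceThrew = false;
try {
    someNameNobodyDeclared; // hypothetical identifier, deliberately undeclared
} catch (e) {
    bareReferenceThrew = e instanceof ReferenceError;
}
var typeofResult = typeof someNameNobodyDeclared;

// The guard pattern applied to a made-up shorthand (MyShorthand is hypothetical).
if (typeof MyShorthand == "undefined") {
    var MyShorthand = { origin: "declared by the guard" };
}

console.log(bareReferenceThrew); // prints true
console.log(typeofResult);       // prints the string undefined
console.log(MyShorthand.origin); // prints declared by the guard
```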
The constructor
    function TracingListener() {
    }
Above is a (very) simple constructor function for us to create objects from. The methods and properties on the prototype are below. Note that while I could have changed the structure to accommodate better data-hiding, the approach below reduces the number of new functions created by declaring them all only once on the prototype. Functions in the constructor are recreated every time the constructor is called with new yourConstructor(), whereas functions on the prototype are shared by all instances.
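The difference between the two styles can be checked directly. This is only an illustrative sketch; WithOwnMethods and WithSharedMethods are hypothetical constructors, not part of the listener.

```javascript
// Methods created inside the constructor: a new function object per instance.
function WithOwnMethods() {
    this.describe = function () { return "own"; };
}

// Methods on the prototype: one function object shared by every instance.
function WithSharedMethods() {}
WithSharedMethods.prototype.describe = function () { return "shared"; };

var a = new WithOwnMethods();
var b = new WithOwnMethods();
var c = new WithSharedMethods();
var d = new WithSharedMethods();

console.log(a.describe === b.describe); // prints false
console.log(c.describe === d.describe); // prints true
```

With hundreds or thousands of listeners alive over a browsing session, the second style avoids creating a fresh set of function objects for each one.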
The prototype definition
Basic properties
    TracingListener.prototype = {
        originalListener: null,
        receivedData: null, //will be an array for incoming data.
The first part of the prototype definition is setting up some basic properties. Note that both are assigned null. These properties will exist on all instances of TracingListener, and thus not be undefined if/when checking. In the case of receivedData, do not be tempted to make it an array here. Remember that methods and properties on the prototype are shared by all instances of the same type, and we don’t want all instances to share the same array for data.
Also worth noting is that receivedData is a good candidate for data-hiding and declaring it local to the constructor... but scope and visibility limitations would mean the functions requiring access to it would either need to be in the constructor as well, or have accessor and mutator methods for it. If you’re making a Singleton or a small number of instances, declaring functions in the constructor is no big deal, but this listener will be instantiated hundreds or thousands of times and it’s important to keep the duplication to a minimum.
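Here is a short sketch of what goes wrong if the array lives on the prototype. SharedBuffer and OwnBuffer are hypothetical constructors used only to show the pitfall; the start method plays the role that onStartRequest plays in the listener.

```javascript
// Pitfall: the array is created once and every instance pushes into it.
function SharedBuffer() {}
SharedBuffer.prototype.receivedData = [];

// Safer: the prototype holds null, each instance gets its own array later.
function OwnBuffer() {}
OwnBuffer.prototype.receivedData = null;
OwnBuffer.prototype.start = function () {
    this.receivedData = []; // assignment creates a property on this instance only
};

var s1 = new SharedBuffer();
var s2 = new SharedBuffer();
s1.receivedData.push("from s1");
console.log(s2.receivedData.length); // prints 1: s2 sees s1's data

var o1 = new OwnBuffer();
var o2 = new OwnBuffer();
o1.start();
o2.start();
o1.receivedData.push("from o1");
console.log(o2.receivedData.length); // prints 0: the buffers are independent
```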
Methods on the prototype
Interface Requirements
        //For the listener this is step 1.
        onStartRequest: function(request, context) {
            this.receivedData = []; //initialize the array
            //Pass on the onStartRequest call to the next listener in the chain -- VERY IMPORTANT
            this.originalListener.onStartRequest(request, context);
        },
onStartRequest is the first thing called when the actual request processing begins. This is also the best opportunity to initialize the array on this listener.
        //This is step 2. This gets called every time additional data is available
        onDataAvailable: function(request, context, inputStream, offset, count) {
            var binaryInputStream = CCIN("@mozilla.org/binaryinputstream;1", "nsIBinaryInputStream");
            binaryInputStream.setInputStream(inputStream);
            var storageStream = CCIN("@mozilla.org/storagestream;1", "nsIStorageStream");
            //8192 is the segment size in bytes, count is the maximum size of the stream in bytes
            storageStream.init(8192, count, null);
            var binaryOutputStream = CCIN("@mozilla.org/binaryoutputstream;1", "nsIBinaryOutputStream");
            binaryOutputStream.setOutputStream(storageStream.getOutputStream(0));
            // Copy received data as they come.
            var data = binaryInputStream.readBytes(count);
            this.receivedData.push(data);
            binaryOutputStream.writeBytes(data, count);
            //Pass it on down the chain
            this.originalListener.onDataAvailable(request, context, storageStream.newInputStream(0), offset, count);
        },
onDataAvailable essentially copies the data from the binaryInputStream to our receivedData array and to the storageStream (via the binaryOutputStream). Then we pass a newInputStream from our storageStream on to the next listener in the chain.
        onStopRequest: function(request, context, statusCode) {
            try {
                //QueryInterface into HttpChannel to access originalURI and requestMethod properties
                request.QueryInterface(Ci.nsIHttpChannel);
                //this is specific to the PirateQuesting Add-on, but is left here as an example
                //of how to modify behaviour based on the requested URL
                if (request.originalURI
                        && piratequesting.baseURL == request.originalURI.prePath
                        && request.originalURI.path.indexOf("/index.php?ajax=") == 0) {
                    var data = null;
                    if (request.requestMethod.toLowerCase() == "post") {
                        var postText = this.readPostTextFromRequest(request, context);
                        //parseQuery() is not a built-in String method; it has to be
                        //provided elsewhere and is not shown in this post
                        if (postText)
                            data = String(postText).parseQuery();
                    }
                    //Combine the response into a single string
                    var responseSource = this.receivedData.join('');
                    //fix leading spaces bug
                    //(FM occasionally adds spaces to the beginning of their ajax responses...
                    //which breaks the XML)
                    responseSource = responseSource.replace(/^\s+(\S[\s\S]+)/, "$1");
                    //gets the date from the response headers on the request.
                    //For PirateQuesting this was preferred over the date on the user's machine
                    var date = Date.parse(request.getResponseHeader("Date"));
                    //Again a PQ specific function call, but left as an example.
                    //This just passes a string URL, the text of the response,
                    //the date, and the data in the POST request (if applicable)
                    piratequesting.ProcessRawResponse(request.originalURI.spec, responseSource, date, data);
                }
            } catch (e) {
                //standard function to dump a formatted version of the error to console
                dumpError(e);
            }
            //Pass it on down the chain
            this.originalListener.onStopRequest(request, context, statusCode);
        },
The onStopRequest above has a few tricky parts. The first is the QueryInterface to nsIHttpChannel; this is critical to getting the info needed. The second tricky part is to get the posted variables. To do so, you need to check that the requestMethod was indeed post, and then we call readPostTextFromRequest which I’ll introduce in a bit. The last tricky bit is getting the Date header from the response. Date.parse() plays nicely with those (assuming the server response conforms).
        QueryInterface: function (aIID) {
            if (aIID.equals(Ci.nsIStreamListener) || aIID.equals(Ci.nsISupports)) {
                return this;
            }
            throw Components.results.NS_NOINTERFACE;
        },
This is pretty standard for anything fulfilling an interface contract for Firefox (or other Mozilla-based browsers). QueryInterface is part of the nsISupports interface and is the only part which is scriptable. All interfaces are derived from nsISupports, so it has to be there.
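The pure string handling inside onStopRequest can be tried outside the browser. The following is a rough sketch in plain JavaScript, under a few assumptions: the chunks and the Date header value are invented sample data, and parseQueryString is a hypothetical stand-in for the parseQuery() helper, which this post does not show. It also assumes the POST text is a plain URL-encoded body.

```javascript
// parseQueryString is a hypothetical stand-in for the parseQuery() helper
// used in onStopRequest; it assumes a URL-encoded body such as "a=1&b=2".
function parseQueryString(text) {
    var result = {};
    if (!text) return result;
    var pairs = text.split("&");
    for (var i = 0; i < pairs.length; i++) {
        if (!pairs[i]) continue;
        var eq = pairs[i].indexOf("=");
        var rawKey = eq < 0 ? pairs[i] : pairs[i].substring(0, eq);
        var rawValue = eq < 0 ? "" : pairs[i].substring(eq + 1);
        // "+" encodes a space in URL-encoded form bodies
        var key = decodeURIComponent(rawKey.replace(/\+/g, " "));
        result[key] = decodeURIComponent(rawValue.replace(/\+/g, " "));
    }
    return result;
}

// Invented sample data standing in for this.receivedData.
var chunks = ["  <?xml version=\"1.0\"?>", "<reply>", "ok</reply>"];

// join() with no argument inserts commas between chunks; join('') does not.
var joinedWithDefault = chunks.join();
var responseSource = chunks.join('');

// Same leading-whitespace fix as in onStopRequest.
responseSource = responseSource.replace(/^\s+(\S[\s\S]+)/, "$1");

// An HTTP Date header in the usual format parses to milliseconds since the epoch.
var date = Date.parse("Sun, 11 Oct 2009 19:35:00 GMT");

var post = parseQueryString("action=buy&item=rum+barrel&qty=2");

console.log(responseSource);
console.log(new Date(date).toISOString());
console.log(post.item);
```

Everything that needs a real channel (the QueryInterface calls, getResponseHeader, the upload stream) is left out on purpose; only the string steps are modelled here.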
Utility methods
The following methods are required by our TracingListener but are not part of the interface contract. (It would also have been possible to define them globally or within a pseudo-namespace.)
        readPostTextFromRequest : function(request, context) {
            try {
                var is = request.QueryInterface(Ci.nsIUploadChannel).uploadStream;
                if (is) {
                    var ss = is.QueryInterface(Ci.nsISeekableStream);
                    var prevOffset;
                    if (ss) {
                        prevOffset = ss.tell();
                        ss.seek(Ci.nsISeekableStream.NS_SEEK_SET, 0);
                    }
                    // Read data from the stream..
                    var charset = "UTF-8";
                    var text = this.readFromStream(is, charset, true);
                    if (ss && prevOffset == 0)
                        ss.seek(Ci.nsISeekableStream.NS_SEEK_SET, 0);
                    return text;
                } else {
                    dump("Failed to Query Interface for upload stream.\n");
                }
            } catch(exc) {
                dumpError(exc);
            }
            return null;
        },
I will readily admit that readPostTextFromRequest is mostly taken from Firebug, though there are a few changes. Basically, we have to do the same thing as before and QueryInterface into the appropriate interface. In this case we need nsIUploadChannel to get access to uploadStream. And then we QueryInterface the uploadStream into an nsISeekableStream (noticing a pattern yet? QueryInterface is your best friend... and worst enemy). After that we store the original offset in the stream in prevOffset, and then seek to the beginning of the stream. Then we read the data and, if the stream was at position 0 originally, we seek to the beginning again.
        readFromStream : function(stream, charset, noClose) {
            var sis = CCSV("@mozilla.org/binaryinputstream;1", "nsIBinaryInputStream");
            sis.setInputStream(stream);
            var segments = [];
            for (var count = stream.available(); count; count = stream.available())
                segments.push(sis.readBytes(count));
            if (!noClose)
                sis.close();
            var text = segments.join("");
            return text;
        }
    };
readFromStream is also largely from Firebug with a few modifications. It is however remarkably similar to what is done in onDataAvailable and onStopRequest. Basically, we get a BinaryInputStream to work with the stream given. Then we loop through the segments of the stream (size provided by available()) and add them to an array. When finished with that, we join the segments and return the text.
    httpRequestObserver = {
        observe: function(request, aTopic, aData) {
            if (typeof Cc == "undefined") {
                var Cc = Components.classes;
            }
            if (typeof Ci == "undefined") {
                var Ci = Components.interfaces;
            }
            if (aTopic == "http-on-examine-response") {
                request.QueryInterface(Ci.nsIHttpChannel);
                if (request.originalURI
                        && piratequesting.baseURL == request.originalURI.prePath
                        && request.originalURI.path.indexOf("/index.php?ajax=") == 0) {
                    var newListener = new TracingListener();
                    request.QueryInterface(Ci.nsITraceableChannel);
                    newListener.originalListener = request.setNewListener(newListener);
                }
            }
        },
        QueryInterface: function(aIID) {
            if (typeof Cc == "undefined") {
                var Cc = Components.classes;
            }
            if (typeof Ci == "undefined") {
                var Ci = Components.interfaces;
            }
            if (aIID.equals(Ci.nsIObserver) || aIID.equals(Ci.nsISupports)) {
                return this;
            }
            throw Components.results.NS_NOINTERFACE;
        }
    };
This part is fairly straightforward. The object httpRequestObserver has to fulfill the contract for the nsIObserver interface, which only has two methods: observe and QueryInterface.
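Going back to readPostTextFromRequest and readFromStream for a moment, the remember-rewind-read-rewind sequence is easier to follow on a toy stream. This is only a sketch: FakeUploadStream is a hypothetical in-memory stand-in that models just the handful of methods used here, and seekToStart replaces the real seek(NS_SEEK_SET, 0) call.

```javascript
// FakeUploadStream is a hypothetical in-memory stand-in for the real upload
// stream; only the few methods used below are modelled.
function FakeUploadStream(text) {
    this.text = text;
    this.pos = 0;
}
FakeUploadStream.prototype = {
    available: function () { return this.text.length - this.pos; },
    readBytes: function (count) {
        var out = this.text.substring(this.pos, this.pos + count);
        this.pos += out.length;
        return out;
    },
    tell: function () { return this.pos; },
    seekToStart: function () { this.pos = 0; }
};

// Same order of operations as readPostTextFromRequest: remember the offset,
// rewind, read everything, and rewind again if we started at position 0.
function readAllAndRewind(stream) {
    var prevOffset = stream.tell();
    stream.seekToStart();
    var segments = [];
    for (var count = stream.available(); count; count = stream.available()) {
        segments.push(stream.readBytes(count));
    }
    if (prevOffset == 0) stream.seekToStart();
    return segments.join("");
}

var upload = new FakeUploadStream("action=buy&qty=2");
var firstRead = readAllAndRewind(upload);
var secondRead = readAllAndRewind(upload);

console.log(firstRead === secondRead); // prints true
console.log(upload.tell());            // prints 0
```

Without the final rewind, whoever reads the stream next would find it already at the end and get nothing back.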
Observer registration
Finally, we need to register the observer:
    var observerService = Cc["@mozilla.org/observer-service;1"]
        .getService(Ci.nsIObserverService);
    observerService.addObserver(httpRequestObserver, "http-on-examine-response", false);
Now the observerService will call the observe method on httpRequestObserver whenever it notifies observers with the http-on-examine-response topic.
When you want to unregister the observer, use:
    observerService.removeObserver(httpRequestObserver, "http-on-examine-response");
As you can see, getting the text and post variables from an http request is non-trivial.
Note, though, that this code does not check the context to determine whether the http request is for a browser window or from a browser window, so depending on the complexity of your situation, you may want to do that as well. Perhaps I’ll add that in another post.
(See Firebug license here. Special thanks to the Firebug team and to Jan Odvarko for providing so much useful material. The interface docs at oxymoronical are a great resource. The Mozilla Developer Center also deserves special credit for great documentation.)
Update (Jan 17, 2010): Corrected a small bug in onStopRequest (Thanks Broady!). See below for details.
Update (April 22, 2010): Corrected a bug which doesn’t occur if Firebug is installed (Thanks Harini!). See below for details.

October 12th, 2009 at 3:35 pm
[...] on presenting how to get the window from which a request originates. Yesterday’s post about XHR Listening by a Firefox Addon gives a good basis to work from so I will assume you’ve read over that and understood it (you [...]
October 25th, 2009 at 1:48 am
Hi,
I need to get the “post” request data in onStartRequest or onDataAvailable of TracingListener, but when I do this it seems as if the HTTP channel is closed and I get no response data. It seems to work fine when I read the request in onStopRequest. Can you help me with this or with an alternate solution?
October 25th, 2009 at 5:46 am
Sorry about the previous post. It looks like it was a typo.
October 25th, 2009 at 8:45 am
No worries. One good option is to get the post data and save it as a property on the channel when listening to http-on-modify-request. That way you don’t have to worry about requesting it again and again.
January 17th, 2010 at 9:49 pm
Thank you for this great article; it was really useful for me. I’ve been playing with this code for a while and I found a bug. There is:
var responseSource = this.receivedData.join();
And in fact there should be:
var responseSource = this.receivedData.join('');
Because the default separator for join is a comma ‘,’! So if you experienced problems with big files, that’s probably the bug :)
January 30th, 2010 at 4:02 pm
Any chance on getting something packaged as an extension? I’ve been trying to get this to work, but I just can’t figure out wtc I’m doing wrong.
January 30th, 2010 at 10:23 pm
@MeatPopsicle This is currently working in the piratequesting add-on. You can download that from http://pq.ashita.org. Just change the .xpi extension to .zip and you’re good to go. Note that the current release doesn’t work with the latest FF (new release soon) but all the code is there to make it work.
January 31st, 2010 at 1:21 pm
Thanks, been trying to get a datamining app done for another browser game, it uses AJAX calls to pull in market data.
February 28th, 2010 at 11:13 am
I tried modifying the response as follows, but it did not work. Any ideas guys?
var moredata = ''.getBytes();
var binaryInputStream = CCIN("@mozilla.org/binaryinputstream;1", "nsIBinaryInputStream");
var storageStream = CCIN("@mozilla.org/storagestream;1", "nsIStorageStream");
var binaryOutputStream = CCIN("@mozilla.org/binaryoutputstream;1", "nsIBinaryOutputStream");
binaryInputStream.setInputStream(inputStream);
storageStream.init(8192, count + moredata.length, null);
binaryOutputStream.setOutputStream(storageStream.getOutputStream(0));
// Copy received data as they come.
var data = binaryInputStream.readBytes(count);
this.receivedData.push(data);
binaryOutputStream.writeBytes(data, count);
binaryOutputStream.writeBytes(moredata, moredata.length);
this.originalListener.onDataAvailable(request, context,
storageStream.newInputStream(0), offset, count + moredata.length);
February 28th, 2010 at 11:15 am
The website cut the value of moredata, should look like:
var moredata = 'some javascript or HTML code'.getBytes();
February 28th, 2010 at 11:40 am
I just did it, had to remove getBytes() :)
March 2nd, 2010 at 5:49 am
Thanks Jonathan, this is a great article.
What is the piratequesting object? I’m trying to run this example, but I get an error.
Best regards.
March 2nd, 2010 at 6:08 am
@Maurico
piratequesting is an add-on I made for the browser game at http://www.piratequest.net. You can download the add-on at pq.ashita.org and test it out. Unfortunately I haven’t updated it in a while and it is no longer compatible with the current version of Firefox. If you turn off version compatibility checking it works just fine.
April 22nd, 2010 at 9:54 pm
Hi Jonathan,
Thanks for this article. I got my extension to listen in on XHR requests using this code, but I have a problem: the code only works if Firebug is installed! Without Firebug, onStartRequest, onDataAvailable and onStopRequest all get ‘undefined’ for their request parameter (I didn’t check the other parameters). But whenever Firebug is also installed in the browser, the request value gets passed and everything works.
Can you please let me know how I can fix this?
tnx
Harini
April 23rd, 2010 at 1:17 am
Harini,
Good catch. I had fixed this bug in PirateQuesting several months back but obviously I’d forgotten to update this blog entry.
Needed to add:
if (typeof Cc == "undefined") {
var Cc = Components.classes;
}
if (typeof Ci == "undefined") {
var Ci = Components.interfaces;
}
to the httpRequestObserver methods as they operate in a different scope than TracingListener. Firebug’s const values were still valid though and the oversight was missed in testing.
June 27th, 2010 at 10:20 am
This is exactly what I was looking for!
June 30th, 2010 at 12:37 pm
Hi,
this is a great site. I have to bookmark this one now as Honza has closed the comments section (for no obvious reason, whatsoever!)
Well then.
I wonder why you need a BINARY input stream for all that?
Suppose I want to use this code as a basis to transform a web page “on the fly” before the browser can get hold of it, e.g. to remove over-eager javascripts or even only a few JS _lines_.
Suppose too that the stream is nothing but HTML+JS; it would be great to be able to use a DOM function like getElementsByTagName() to get some nodes *before* the browser can do anything with the page.
AFAIK this is not possible with a binary input stream.
So am I doomed to use the BIS or can I also use a sort of “character input stream”?
July 1st, 2010 at 1:42 am
@Andy,
What you *could* try doing is not passing things on down the listener chain until the very end. Do your processing on the completed string/page and then pass the modified version on to the other listeners.
To perform normal DOM methods, you can load the content into a virtual iframe:
function createDoc (htmlText) {
    _iframe.docShell.allowJavascript = false;
    _iframe.docShell.allowAuth = false;
    _iframe.docShell.allowPlugins = false;
    _iframe.docShell.allowMetaRedirects = false;
    _iframe.docShell.allowSubframes = false;
    _iframe.docShell.allowImages = false;
    var doc = _iframe.contentDocument;
    var strip = /<html[^>]*>([\s\S]*?)<\/html>/i;
    if (strip.test(htmlText)) {
        doc.getElementsByTagName("html")[0].innerHTML = strip.exec(htmlText)[1];
    }
    return doc;
}
And then just get the content back out again when you want to pass it on to the next listener.
July 1st, 2010 at 9:11 pm
@ Jonathan,
thanks for your reply, and your sample code. However, you’re one step ahead :)
I’m still in onDataAvailable():
var data = binaryInputStream.readBytes(count);
At first, I’ll need the step how to get the binaryInputStream to something better parseable.
July 3rd, 2010 at 4:46 am
@Andy,
The binary input stream’s readBytes() method just returns a string of bytes... you *could* parse it (and manipulate it), but there are too many potential problems.
Don’t try to parse until the download is complete. Let’s say the stream comes in two parts… You can’t create a parse-able document from that. You’ll have unclosed tags, parts of multi-byte characters, etc.
By the way… in all honesty, it’s probably better to modify the document after it has loaded. Delaying document loading is definitely unexpected behaviour.
July 7th, 2010 at 7:13 am
@Johnathan,
well you do have a point here, but … to get rid of over-eager javascripts I MUST modify the document before the browser sees it (if this was what you mean, that is). The JS interpreters embedded in various browsers will always *pre-process* all inline JS, so any changes after loading the full document will get dismissed. (Trust me, I had to go through all this first to realize that my attempts were fruitless :))