Oct 11 2009

Howto: XHR Listening by a Firefox Addon

Category: Firefox, JavaScript, Pirate Questing | Jonathan Fingland @ 4:00 am

The following post draws significantly from a post by Jan Odvarko at http://www.softwareishard.com/blog/firebug/nsitraceablechannel-intercept-http-traffic/ but goes a bit further. There are also some sections which were inspired by Firebug, but are heavily modified.

  • What you need to know

    Before I get into the code, the most important thing to understand is that your extension’s listener is just one in a chain. Every listener in the chain is responsible for passing the information on to the next one. Failing to do so has some amusing consequences, like nothing loading in the browser.

    Just to make it really clear — Don’t drop the ball. (Edit: And while you can edit the data in the stream — don’t do it unless you have a really good reason.)
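
    To make the chain idea concrete, here is a minimal sketch in plain JavaScript that runs outside Firefox. The names makeForwardingListener, makeDroppingListener and finalConsumer are hypothetical stand-ins, not Mozilla APIs; they only mimic how each listener is expected to hand the call to the next one.

```javascript
// Simulation only: these objects are hypothetical stand-ins, not XPCOM listeners.
var finalConsumer = {
    received: [],
    onDataAvailable: function (data) {
        this.received.push(data);
    }
};

// A well-behaved listener: looks at the data, then passes it on unchanged.
function makeForwardingListener(next, log) {
    return {
        onDataAvailable: function (data) {
            log.push(data);              // observe
            next.onDataAvailable(data);  // forward to the next link in the chain
        }
    };
}

// A badly behaved listener: observes the data but never forwards it.
function makeDroppingListener(log) {
    return {
        onDataAvailable: function (data) {
            log.push(data);
        }
    };
}

var seenByGood = [];
var goodChain = makeForwardingListener(finalConsumer, seenByGood);
goodChain.onDataAvailable("chunk 1");

var seenByBad = [];
var badChain = makeDroppingListener(seenByBad);
badChain.onDataAvailable("chunk 2");
```

    After both calls, finalConsumer has only received "chunk 1". The second chunk was seen by the dropping listener and then lost, which is the simulated equivalent of a page that never loads.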

  • Convenience methods and aliases

    Many Firefox internals are accessed through Components.classes and Components.interfaces. The verbosity makes the code clear, but it gets repetitive and takes a long time to type. The common shorthands are Cc and Ci, along with a few less common ones: CCIN (creates an instance of a class from a class name and an interface name) and CCSV (similarly gets a service from a class name and an interface name).

    
    if (typeof Cc == "undefined") {
        var Cc = Components.classes;
    }
    if (typeof Ci == "undefined") {
        var Ci = Components.interfaces;
    }
    if (typeof CCIN == "undefined") {
        function CCIN(cName, ifaceName) {
            return Cc[cName].createInstance(Ci[ifaceName]);
        }
    }
    if (typeof CCSV == "undefined") {
        function CCSV(cName, ifaceName) {
            if (Cc[cName])
                // if fbs fails to load, the error can be _CC[cName] has no properties
                return Cc[cName].getService(Ci[ifaceName]);
            else
                dumpError("CCSV fails for cName:" + cName);
        }
    }

    • What’s with all of the typeof checks?

      Firebug, gotta love it, declares the same things using const. Inside of an if() block, a const is still seen and conflicts, even when the if condition evaluates to false. The code above is essentially a workaround to satisfy both possibilities. If the user has Firebug installed, then carry on; if the user doesn’t have Firebug installed, declare those shorthands.

  • The constructor

    function TracingListener() {
    }

    Above is a (very) simple constructor function for us to create objects from. The methods and properties on the prototype are below. Note that while I could have changed the structure to accommodate better data-hiding, the method below reduces the number of new functions created by making them all declared only once on the prototype. Functions in the constructor are recreated every time the constructor is called with new yourConstructor() whereas functions on the prototype are shared by all instances.
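
    The difference between the two approaches can be checked directly. The sketch below uses two hypothetical constructors, PerInstance and Shared, which are not part of the listener code; they exist only to compare function identity across instances.

```javascript
// Functions defined inside the constructor are created anew for each instance.
function PerInstance() {
    this.describe = function () {
        return "per-instance";
    };
}

// Functions defined on the prototype are created once and shared.
function Shared() {
}
Shared.prototype.describe = function () {
    return "shared";
};

var a = new PerInstance(), b = new PerInstance();
var c = new Shared(), d = new Shared();

// false: a and b each carry their own function object
var perInstanceIsShared = (a.describe === b.describe);
// true: c and d both resolve describe to the single function on the prototype
var prototypeIsShared = (c.describe === d.describe);
```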

  • The prototype definition

    • Basic properties

      TracingListener.prototype =
      {
          originalListener: null,
          receivedData: null,   //will be an array for incoming data.
      

      The first part of the prototype definition is setting up some basic properties. Note that both are assigned null. These properties will exist on all instances of TracingListener, and thus not be undefined if/when checking. In the case of receivedData, do not be tempted to make it an array here. Remember that methods and properties on the prototype are shared by all instances of the same type — and we don’t want all instances to share the same array for data.
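
      The shared-array pitfall is easy to reproduce. In this sketch BadListener and GoodListener are hypothetical constructors made up for the comparison; only GoodListener follows the pattern used by TracingListener (null on the prototype, a fresh array per request).

```javascript
function BadListener() {
}
BadListener.prototype.receivedData = []; // one array, shared by every instance

function GoodListener() {
}
GoodListener.prototype.receivedData = null;
GoodListener.prototype.onStartRequest = function () {
    this.receivedData = []; // a fresh array owned by this instance
};

var bad1 = new BadListener(), bad2 = new BadListener();
bad1.receivedData.push("response A");
// bad2 now sees data that belongs to bad1's request
var leakedCount = bad2.receivedData.length;

var good1 = new GoodListener(), good2 = new GoodListener();
good1.onStartRequest();
good2.onStartRequest();
good1.receivedData.push("response A");
// good2 has its own array, so it stays empty
var isolatedCount = good2.receivedData.length;
```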

      Also worth noting is that receivedData is a good candidate for data-hiding by declaring it local to the constructor… but scope and visibility limitations would mean the functions requiring access to it would either need to be in the constructor as well, or need accessor and mutator methods for it. If you’re making a Singleton or a small number of instances, declaring functions in the constructor is no big deal, but this listener will be instantiated hundreds or thousands of times and it’s important to keep the duplication to a minimum.

    • Methods on the prototype

      • Interface Requirements
            //For the listener this is step 1.
            onStartRequest: function(request, context) {
                this.receivedData = []; //initialize the array

                //Pass on the onStartRequest call to the next listener in the chain -- VERY IMPORTANT
                this.originalListener.onStartRequest(request, context);
            },

        onStartRequest is the first thing called when the actual request processing begins. This is also the best opportunity to initialize the array on this listener.

            //This is step 2. This gets called every time additional data is available
            onDataAvailable: function(request, context, inputStream, offset, count)
            {
                var binaryInputStream = CCIN("@mozilla.org/binaryinputstream;1",
                                             "nsIBinaryInputStream");
                binaryInputStream.setInputStream(inputStream);

                var storageStream = CCIN("@mozilla.org/storagestream;1",
                                         "nsIStorageStream");
                //8192 is the segment size in bytes, count is the maximum size of the stream in bytes
                storageStream.init(8192, count, null);

                var binaryOutputStream = CCIN("@mozilla.org/binaryoutputstream;1",
                                              "nsIBinaryOutputStream");
                binaryOutputStream.setOutputStream(storageStream.getOutputStream(0));

                // Copy received data as they come.
                var data = binaryInputStream.readBytes(count);

                this.receivedData.push(data);

                binaryOutputStream.writeBytes(data, count);

                //Pass it on down the chain
                this.originalListener.onDataAvailable(request,
                                                      context,
                                                      storageStream.newInputStream(0),
                                                      offset,
                                                      count);
            },

        onDataAvailable essentially copies the data from the binaryInputStream to our receivedData array and to the storageStream (via the binaryOutputStream). Then we pass a new InputStream from our storageStream onto the next listener in the chain.
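
        The reason for the copy is that reading from the incoming stream consumes it, so the next listener would otherwise find nothing left. The following is a simulation in plain JavaScript, not XPCOM code: FakeInputStream, nextListener and tracer are hypothetical stand-ins that only model a consumable stream and the copy-then-forward step.

```javascript
// Simulation only: FakeInputStream is a hypothetical stand-in for an XPCOM input stream.
function FakeInputStream(text) {
    this.remaining = text;
}
FakeInputStream.prototype.available = function () {
    return this.remaining.length;
};
FakeInputStream.prototype.readBytes = function (count) {
    var chunk = this.remaining.substring(0, count);
    this.remaining = this.remaining.substring(count); // reading consumes the data
    return chunk;
};

// Stand-in for the next listener in the chain.
var nextListener = {
    got: "",
    onDataAvailable: function (stream, count) {
        this.got += stream.readBytes(count);
    }
};

var tracer = {
    receivedData: [],
    onDataAvailable: function (stream, count) {
        var data = stream.readBytes(count);   // this empties the incoming stream
        this.receivedData.push(data);         // keep our own copy
        var copy = new FakeInputStream(data); // fresh stream holding the same bytes
        nextListener.onDataAvailable(copy, count);
    }
};

var incoming = new FakeInputStream("<response/>");
tracer.onDataAvailable(incoming, incoming.available());
```

        In the real listener the storage stream plays the role of the fresh copy: the bytes are written into it once and a new input stream over them is handed down the chain.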

            onStopRequest: function(request, context, statusCode)
            {
                try
                {
                    //QueryInterface into HttpChannel to access originalURI and requestMethod properties
                    request.QueryInterface(Ci.nsIHttpChannel);

                    //this is specific to the PirateQuesting Add-on, but is left here as an
                    //example of how to modify behaviour based on the requested URL
                    if (request.originalURI
                        && piratequesting.baseURL == request.originalURI.prePath
                        && request.originalURI.path.indexOf("/index.php?ajax=") == 0)
                    {
                        var data = null;
                        if (request.requestMethod.toLowerCase() == "post")
                        {
                            var postText = this.readPostTextFromRequest(request, context);
                            //parseQuery() is not a built-in String method; it has to be
                            //provided elsewhere (here, presumably by the add-on)
                            if (postText)
                                data = String(postText).parseQuery();
                        }

                        //Combine the response into a single string
                        var responseSource = this.receivedData.join('');

                        //fix leading spaces bug
                        //(FM occasionally adds spaces to the beginning of their ajax responses...
                        //which breaks the XML)
                        responseSource = responseSource.replace(/^\s+(\S[\s\S]+)/, "$1");

                        //gets the date from the response headers on the request.
                        //For PirateQuesting this was preferred over the date on the user's machine
                        var date = Date.parse(request.getResponseHeader("Date"));

                        //Again a PQ specific function call, but left as an example.
                        //This just passes a string URL, the text of the response,
                        //the date, and the data in the POST request (if applicable)
                        piratequesting.ProcessRawResponse(request.originalURI.spec,
                                                          responseSource,
                                                          date,
                                                          data);
                    }
                }
                catch (e)
                {
                    //standard function to dump a formatted version of the error to console
                    dumpError(e);
                }
                //Pass it on down the chain
                this.originalListener.onStopRequest(request,
                                                    context,
                                                    statusCode);
            },

        The onStopRequest above has a few tricky parts. The first is the QueryInterface to nsIHttpChannel, which is critical to getting the info needed. The second is getting the posted variables: you need to check that the requestMethod was indeed post, and then call readPostTextFromRequest, which I’ll introduce in a bit. The last tricky bit is getting the Date header from the response. Date.parse() plays nicely with those (assuming the server response conforms).
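
        The string handling in onStopRequest can be tried on its own. The sketch below uses made-up sample chunks and a sample Date header value (neither comes from a real response) to show the join, the leading-whitespace fix and the date parsing in isolation.

```javascript
// Chunks as they might have been collected in receivedData (made-up sample data).
var receivedData = ["  \n<root>", "<item/>", "</root>"];

// join() with no argument inserts commas between chunks; join('') does not.
var withCommas = receivedData.join();
var responseSource = receivedData.join('');

// Strip leading whitespace so the markup starts at the first character.
responseSource = responseSource.replace(/^\s+(\S[\s\S]+)/, "$1");

// An HTTP Date header value in the usual format (sample value only).
var headerValue = "Sun, 11 Oct 2009 04:00:00 GMT";
// Milliseconds since the epoch, or NaN if the string cannot be parsed.
var timestamp = Date.parse(headerValue);
```

        Here withCommas ends up with stray commas between the chunks, which is why the empty-string separator matters, while responseSource is the clean markup with no leading whitespace.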

            QueryInterface: function (aIID) {
                if (aIID.equals(Ci.nsIStreamListener) ||
                    aIID.equals(Ci.nsISupports)) {
                    return this;
                }
                throw Components.results.NS_NOINTERFACE;
            },

        This is pretty standard for anything fulfilling an interface contract for Firefox (or other Mozilla-based browsers). QueryInterface is part of the nsISupports interface and is the only part of it which is scriptable. All interfaces are derived from nsISupports, so it has to be there.
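
        Since the same QueryInterface shape is repeated for every object, it could be generated by a small helper. This is only a sketch: makeQueryInterface and fakeIID are hypothetical names, and the fake interface IDs exist so the code runs outside Firefox. In an extension the list would hold Ci.nsIStreamListener and Ci.nsISupports, and the error would be Components.results.NS_NOINTERFACE.

```javascript
// Hypothetical helper: builds a QueryInterface method from a list of supported interface IDs.
function makeQueryInterface(supported, noInterfaceError) {
    return function (aIID) {
        for (var i = 0; i < supported.length; i++) {
            if (aIID.equals(supported[i])) {
                return this;
            }
        }
        throw noInterfaceError;
    };
}

// Stand-ins for real interface IDs; they only model the equals() comparison.
function fakeIID(name) {
    return {
        name: name,
        equals: function (other) { return other.name === name; }
    };
}
var FAKE_STREAM_LISTENER = fakeIID("nsIStreamListener");
var FAKE_SUPPORTS = fakeIID("nsISupports");
var FAKE_OBSERVER = fakeIID("nsIObserver");
var FAKE_NO_INTERFACE = "NS_NOINTERFACE";

var listener = {
    QueryInterface: makeQueryInterface([FAKE_STREAM_LISTENER, FAKE_SUPPORTS], FAKE_NO_INTERFACE)
};

// Supported interface: the object returns itself.
var sameObject = (listener.QueryInterface(FAKE_SUPPORTS) === listener);

// Unsupported interface: the call throws.
var thrown = null;
try {
    listener.QueryInterface(FAKE_OBSERVER);
} catch (e) {
    thrown = e;
}
```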

      • Utility methods

        The following methods are required by our TracingListener but are not part of the interface contract. (It would also have been possible to define them globally or within a pseudo-namespace.)

            readPostTextFromRequest: function(request, context) {
                try
                {
                    var is = request.QueryInterface(Ci.nsIUploadChannel).uploadStream;
                    if (is)
                    {
                        var ss = is.QueryInterface(Ci.nsISeekableStream);
                        var prevOffset;
                        if (ss)
                        {
                            prevOffset = ss.tell();
                            ss.seek(Ci.nsISeekableStream.NS_SEEK_SET, 0);
                        }

                        // Read data from the stream..
                        var charset = "UTF-8";
                        var text = this.readFromStream(is, charset, true);

                        if (ss && prevOffset == 0)
                            ss.seek(Ci.nsISeekableStream.NS_SEEK_SET, 0);

                        return text;
                    }
                    else {
                        dump("Failed to Query Interface for upload stream.\n");
                    }
                }
                catch(exc)
                {
                    dumpError(exc);
                }

                return null;
            },

        I will readily admit that readPostTextFromRequest is mostly taken from Firebug, though there are a few changes. Basically, we have to do the same thing as before and QueryInterface into the appropriate interface. In this case we need nsIUploadChannel to get access to uploadStream. Then we QueryInterface the uploadStream into an nsISeekableStream (noticing a pattern yet? QueryInterface is your best friend... and worst enemy). After that we store the original offset of the stream in prevOffset and seek to the beginning of the stream. Then we read the data and, if the stream was at position 0 originally, we seek to the beginning again.
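
        The text returned here is what onStopRequest turns into key/value pairs with parseQuery(). That method is not built into JavaScript strings, so it presumably comes from elsewhere in the add-on or from a library. As a rough idea of what such a helper might do, here is a sketch under the assumption that the text is a plain URL-encoded form body; parseQueryString is a hypothetical name and the sample input is made up.

```javascript
// Hypothetical helper: turns "a=1&b=two+words" into { a: "1", b: "two words" }.
function parseQueryString(text) {
    var result = {};
    if (!text) {
        return result;
    }
    var pairs = text.split("&");
    for (var i = 0; i < pairs.length; i++) {
        if (pairs[i] === "") {
            continue; // skip empty segments such as a trailing "&"
        }
        var eq = pairs[i].indexOf("=");
        var rawKey = eq === -1 ? pairs[i] : pairs[i].substring(0, eq);
        var rawValue = eq === -1 ? "" : pairs[i].substring(eq + 1);
        // In form encoding "+" stands for a space; decodeURIComponent handles %XX escapes.
        var key = decodeURIComponent(rawKey.replace(/\+/g, " "));
        var value = decodeURIComponent(rawValue.replace(/\+/g, " "));
        result[key] = value;
    }
    return result;
}

var parsed = parseQueryString("action=attack&target=Black%20Beard&note=two+words");
```

        If the stream text carries anything besides the form body, that extra material would need to be stripped before parsing; the sketch does not attempt this.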

            readFromStream: function(stream, charset, noClose) {
                var sis = CCSV("@mozilla.org/binaryinputstream;1",
                               "nsIBinaryInputStream");
                sis.setInputStream(stream);

                var segments = [];
                for (var count = stream.available(); count; count = stream.available())
                    segments.push(sis.readBytes(count));

                if (!noClose)
                    sis.close();

                var text = segments.join("");
                return text;
            }

        };

        readFromStream is also largely from Firebug with a few modifications. It is however remarkably similar to what is done in onDataAvailable and onStopRequest. Basically, we get a BinaryInputStream to work with the stream given. Then we loop through the segments of the stream (size provided by available()) and add them to an array. When finished with that, we join the segments and return the text.

        var httpRequestObserver = {

            observe: function(request, aTopic, aData) {
                if (typeof Cc == "undefined") {
                    var Cc = Components.classes;
                }
                if (typeof Ci == "undefined") {
                    var Ci = Components.interfaces;
                }
                if (aTopic == "http-on-examine-response") {
                    request.QueryInterface(Ci.nsIHttpChannel);

                    if (request.originalURI
                        && piratequesting.baseURL == request.originalURI.prePath
                        && request.originalURI.path.indexOf("/index.php?ajax=") == 0) {
                        var newListener = new TracingListener();
                        request.QueryInterface(Ci.nsITraceableChannel);
                        newListener.originalListener = request.setNewListener(newListener);
                    }
                }
            },

            QueryInterface: function(aIID) {
                if (typeof Cc == "undefined") {
                    var Cc = Components.classes;
                }
                if (typeof Ci == "undefined") {
                    var Ci = Components.interfaces;
                }
                if (aIID.equals(Ci.nsIObserver) ||
                    aIID.equals(Ci.nsISupports)) {
                    return this;
                }

                throw Components.results.NS_NOINTERFACE;
            }
        };

        This part is fairly straightforward. The object httpRequestObserver has to fulfill the contract for the nsIObserver interface, which needs only two methods: observe, plus the QueryInterface inherited from nsISupports.
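
        The URL condition appears twice (in observe and in onStopRequest), so it could be pulled into one function. The sketch below assumes a hypothetical helper named shouldTrace and uses plain objects in place of nsIURI, with example.com addresses standing in for the real base URL; only the prePath and path properties are modelled.

```javascript
// Hypothetical helper mirroring the condition used in observe() and onStopRequest().
function shouldTrace(uri, baseURL, pathPrefix) {
    return Boolean(uri)
        && uri.prePath === baseURL
        && uri.path.indexOf(pathPrefix) === 0;
}

// Plain objects standing in for nsIURI; only prePath and path are modelled.
var ajaxCall = { prePath: "http://www.example.com", path: "/index.php?ajax=inventory" };
var pageLoad = { prePath: "http://www.example.com", path: "/index.php?page=home" };
var otherSite = { prePath: "http://other.example.org", path: "/index.php?ajax=inventory" };

var base = "http://www.example.com";
var prefix = "/index.php?ajax=";

var traceAjax = shouldTrace(ajaxCall, base, prefix);   // same site, ajax path
var tracePage = shouldTrace(pageLoad, base, prefix);   // same site, different path
var traceOther = shouldTrace(otherSite, base, prefix); // different site
var traceNone = shouldTrace(null, base, prefix);       // no URI at all
```

        Filtering this early keeps a TracingListener from being created for requests the add-on has no interest in.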

  • Observer registration

    Finally, we need to register the observer:

    var observerService = Cc["@mozilla.org/observer-service;1"]
        .getService(Ci.nsIObserverService);
    
    observerService.addObserver(httpRequestObserver,
        "http-on-examine-response", false);

    Now the observerService will call the observe method on httpRequestObserver whenever it notifies observers with the http-on-examine-response topic.

    When you want to unregister the observer, use:

    observerService.removeObserver(httpRequestObserver,
        "http-on-examine-response");

    As you can see, getting the text and post variables from an http request is non-trivial.
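
    For the registration and unregistration calls above, it can help to keep track of whether the observer is currently registered so that neither call happens twice. This is a sketch only: makeRegistration is a hypothetical wrapper, and fakeService is a stand-in that counts calls so the code runs outside Firefox. In an extension the real observerService would be passed in instead.

```javascript
// Hypothetical wrapper that remembers whether the observer is registered.
function makeRegistration(observerService, observer, topic) {
    var registered = false;
    return {
        register: function () {
            if (!registered) {
                observerService.addObserver(observer, topic, false);
                registered = true;
            }
        },
        unregister: function () {
            if (registered) {
                observerService.removeObserver(observer, topic);
                registered = false;
            }
        }
    };
}

// Fake observer service so the sketch runs outside Firefox; it only counts calls.
var fakeService = {
    added: 0,
    removed: 0,
    addObserver: function (observer, topic, ownsWeak) { this.added++; },
    removeObserver: function (observer, topic) { this.removed++; }
};

var registration = makeRegistration(fakeService, {}, "http-on-examine-response");
registration.register();
registration.register();   // ignored: already registered
registration.unregister();
registration.unregister(); // ignored: already unregistered
```

    One possible use is to call register() when the add-on starts up and unregister() when it shuts down, though where exactly those calls belong depends on how the extension is structured.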

Note, though, that this code does not check the context to determine whether the http request is for a browser window or from a browser window, so depending on the complexity of your situation you may want to do that as well. Perhaps I’ll add that in another post.

(See Firebug license here. Special thanks to the Firebug team and to Jan Odvarko for providing so much useful material. The interface docs at oxymoronical are a great resource. The Mozilla Developer Center also deserves special credit for great documentation.)

Update (Jan 17, 2010): Corrected a small bug in onStopRequest (Thanks Broady!). See below for details.

Update (April 22, 2010): Corrected a bug which doesn’t occur if Firebug is installed (Thanks Harini!). See below for details.


22 Responses to “Howto: XHR Listening by a Firefox Addon”

  1. Ashita.org » Getting the source window of a request says:

    [...] on presenting how to get the window from which a request originates. Yesterday’s post about XHR Listening by a Firefox Addon gives a good basis to work from so I will assume you’ve read over that and understood it (you [...]

  2. Ashita.org » Getting the source window of a request says:

    [...] post about XHR Listening by a Firefox Addon gives a good basis to work from so I will assume you’ve read over that and understood it (you [...]

  3. anupbasil says:

    Hi,
    I need to get the “post” request data onRequestStart or onDataAvailble of TracingListener but when i do this it seems as if the httpchannel is closed and i get no response data. It seems to be working fine when i read the request on onRequestStop. Can you help me with this or with an alternate solution?

  4. anupbasil says:

    Sorry about the previous post. It looks like a typo error

  5. Jonathan Fingland says:

    No worries. One good option is to get the post data and save it as a property on the channel when listening to http-on-modify-request. That way you don’t have to worry about requesting it again and again.

  6. Broady says:

    Thank You for this great article – it was really useful for me. I’m playing with this code for a while and I found a bug. There is:

    var responseSource = this.receivedData.join();

    And in a fact there should be

    var responseSource = this.receivedData.join('');

    Because the default separator for join is a comma ‘,’ ! So if You experienced problems with big files – that’s probably the bug :)

  7. MeatPopsicle says:

    Any chance on getting something packaged as an extension? I’ve been trying to get this to work, but I just can’t figure out wtc I’m doing wrong.

  8. Jonathan Fingland says:

    @MeatPopsicle This is currently working in the piratequesting add-on. You can download that from http://pq.ashita.org. Just change the .xpi extension to .zip and you’re good to go. Note that the current release doesn’t work with the latest FF (new release soon) but all the code is there to make it work.

  9. MeatPopsicle says:

    Thanks, been trying to get a datamining app done for another browser game, it uses AJAX calls to pull in market data.

  10. Alexi Jordanov says:

    I tried modifying the response as follows, but it did not work. Any ideas guys?

    var moredata = ''.getBytes();

    var binaryInputStream = CCIN("@mozilla.org/binaryinputstream;1",
        "nsIBinaryInputStream");
    var storageStream = CCIN("@mozilla.org/storagestream;1", "nsIStorageStream");
    var binaryOutputStream = CCIN("@mozilla.org/binaryoutputstream;1",
        "nsIBinaryOutputStream");

    binaryInputStream.setInputStream(inputStream);
    storageStream.init(8192, count + moredata.length, null);
    binaryOutputStream.setOutputStream(storageStream.getOutputStream(0));

    // Copy received data as they come.
    var data = binaryInputStream.readBytes(count);
    this.receivedData.push(data);

    binaryOutputStream.writeBytes(data, count);
    binaryOutputStream.writeBytes(moredata, moredata.length);

    this.originalListener.onDataAvailable(request, context,
        storageStream.newInputStream(0), offset, count + moredata.length);

  11. Alexi Jordanov says:

    The website cut the value of moredata, should look like:
    var moredata = 'some javascript or HTML code'.getBytes();

  12. Alexi Jordanov says:

    I just did it, had to remove getBytes() :)

  13. Mauricio Gaueca F. says:

    Thanks Jonathan, this is a great article.
    What is piratequesting object? I’m trying to run this example, but i get an error.

    Best regards.

  14. Jonathan Fingland says:

    @Mauricio

    piratequesting is an add-on I made for the browser game at http://www.piratequest.net. You can download the add-on at pq.ashita.org and test it out. Unfortunately I haven’t updated it in a while and it is no longer compatible with the current version of Firefox. If you turn off version compatibility checking it works just fine.

  15. Harini says:

    Hi Jonathan,

    Tnx for this article. I got my extension to listen in on XHR request using this code but I have a problem, the code only works if firebug is installed! Without firebug the onStart, onDataAvailable and onStopRequest all get value as ‘undefined’ for their request parameter(i didn’t check for other parameters). But whenever firebug is also installed in the browser the request value gets passed and everything works.

    Can you pls let me how i can fix this?

    tnx
    Harini

  16. Jonathan Fingland says:

    Harini,

    Good catch. I had fixed this bug in PirateQuesting several months back but obviously I’d forgotten to update this blog entry.

    Needed to add:


    if (typeof Cc == "undefined") {
        var Cc = Components.classes;
    }
    if (typeof Ci == "undefined") {
        var Ci = Components.interfaces;
    }

    to the httpRequestObserver methods as they operate in a different scope than TracingListener. Firebug’s const values were still valid though and the oversight was missed in testing.

  17. designer mode says:

    This is exactly what I was looking for!

  18. Andy E says:

    Hi,

    this is a great site. I have to bookmark this one now as Honza has closed the comments section (for no obvious reason, whatsoever!)
    Well then.

    I wonder why you need a BINARY input stream for all that?
    Supposed I want to use this code as a basis to transform a web page “on the fly” before the browser can get hold of it, e. g. to remove over-eager javascripts or even only a few JS _lines_.

    Supposed too that the stream is nothing but HTML+JS, it would be great to be able to use a DOM function like getElementsByTagName() to get some nodes *before* the browser can do anything with the page.
    AFAIK this is not possible with a binary input stream.

    So am I doomed to use the BIS or can I also use a sort of “character input stream”?

  19. Jonathan Fingland says:

    @Andy,

    What you *could* try doing is not passing things on down the listener chain until the very end. Do your processing on the completed string/page and then pass the modified version on to the other listeners.

    To perform normal DOM methods, you can load the content into a virtual iframe:

    function createDoc (htmlText) {
        _iframe.docShell.allowJavascript = false;
        _iframe.docShell.allowAuth = false;
        _iframe.docShell.allowPlugins = false;
        _iframe.docShell.allowMetaRedirects = false;
        _iframe.docShell.allowSubframes = false;
        _iframe.docShell.allowImages = false;

        var doc = _iframe.contentDocument;
        var strip = /([\s\S]*?)<\/html>/i;
        if (strip.test(htmlText)) {
            doc.getElementsByTagName("html")[0].innerHTML = strip.exec(htmlText)[1];
        }

        return doc;
    }

    And then just get the content back out again when you want to pass it on to the next listener

  20. Andy E says:

    @ Jonathan,

    thanks for your reply, and your sample code. However, you’re one step ahead :)

    I’m still in onDataAvailable():
    var data = binaryInputStream.readBytes(count);

    At first, I’ll need the step how to get the binaryInputStream to something better parseable.

  21. Jonathan Fingland says:

    @Andy,

    The binary input stream’s readBytes() method is just a string of bytes… you *could* parse it–and manipulate it–but there are too many potential problems.

    Don’t try to parse until the download is complete. Let’s say the stream comes in two parts… You can’t create a parse-able document from that. You’ll have unclosed tags, parts of multi-byte characters, etc.

    By the way… in all honesty, it’s probably better to modify the document after it has loaded. Delaying document loading is definitely unexpected behaviour.

  22. Andy E says:

    @Jonathan,

    well you do have a point here, but … to get rid of over-eager javascripts I MUST modify the document before the browser sees it (if this was what you mean, that is). The JS interpreters embedded in various browsers will always *pre-process* all inline JS, so any changes after loading the full document will get dismissed. (Trust me, I had to go through all this first to realize that my attempts were fruitless :))
