Saturday, March 25, 2017

Chromium is easy to hack, but hard to write production code

A common adage in computer science is that any problem can be solved with another layer of abstraction. A great example is templates. By abstracting the code away from the types it operates on, you can write code that works across any type. A direct application of this is Rico Mariani's excellent post on a process for templating code to enable unit testing. He later followed this with an example for something tricky: removing dependencies on global Win32 functions and types. This is a true challenge and he overcame it like a champ. I recommend reading both of those articles; if not now, then queue them up for when you are done.
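To make the idea concrete, here is a minimal sketch of that style of templating. It is not Rico's code; `RealPlatform`, `FakePlatform`, `GetUserName` and `MakeGreeting` are hypothetical names I made up for illustration, and the "real" platform just returns a fixed string where actual code would call into the OS.

```cpp
#include <cassert>
#include <string>

// Hypothetical production platform. Real code would wrap a global OS call
// here; this stand-in returns a fixed value so the sketch stays portable.
struct RealPlatform {
  static std::string GetUserName() { return "real-user"; }
};

// Test double exposing the same static interface.
struct FakePlatform {
  static std::string GetUserName() { return "test-user"; }
};

// The code under test is templated on the platform type, so it never names
// a global function directly and a test can substitute FakePlatform.
template <typename Platform>
std::string MakeGreeting() {
  const std::string name = Platform::GetUserName();
  assert(!name.empty());  // The sketch assumes the platform returns a name.
  return "Hello, " + name;
}
```

Production code would call `MakeGreeting<RealPlatform>()` while a unit test calls `MakeGreeting<FakePlatform>()`. The price is exactly what bothered people: every function touching the platform grows a template parameter.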

The transformation he proposed was foreign and made the code look odd, which didn't sit well with a lot of developers. It became a tool we would use sparingly rather than adopt at large. Having all of our developers punch through that abstraction moving forward was just too much.

What does this have to do with Chromium? The more I dabble in the Chromium code, the more I come across brain-melting levels of abstraction. Abstractions are great tools for humans to deal with complexity and hide low-level details, but when you have to poke a hole in the abstraction all of this value is lost. You instead have to master all of that complexity, exposed in its full glory, in order to achieve your end goal.

Or you hack it!

The difference between production and hack led to this tweet.


Yes, it does indeed take 20, maybe more, classes to do some things the "right" way. When I was done I still didn't feel like I had chosen the "right" way either, since there were other considerations in play, such as whether to use the legacy or the modern marshaling system, which I'll get into later. To get started, let me share a diagram I built as I was trying to figure out how to plumb that bool.


I've left out some of the complexity in this case, mostly around the WebKit code starting at WebLocalFrame. I've also left out some of the considerations in dealing with the ContentSettingsObserver object that has a few other classes you can interact with like RendererContentSettingsRules. You can thank me for all of those elided details later.

The hack, if we chose to write it, would be somewhere on the right side of the diagram. The closest we can put the hack is where the flag is needed, inside of HTMLExampleElement::doSomething. Imagine that "something" in this case enables some new VR-like feature and that we are going to allow the user to decide if the given site should have access.

This is quite the complicated thing already, even as a hack. But imagine we simplify further and just have a small list of sites we want to allow and it could either live in resources or just be hard coded. To accomplish this we can just write code in a single file. In this case HTMLExampleElement.cc might be the place.
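As a rough sketch of what that single-file hack could look like, assume a hard-coded list and a helper that HTMLExampleElement::doSomething would call with the page's host. The host names and the function name `IsHostAllowedForExampleFeature` are invented for illustration, and the sketch uses plain standard C++ rather than any Chromium string or URL types.

```cpp
#include <cassert>
#include <string>

namespace {

// Hypothetical hard-coded allowlist. In the hack this lives right next to
// the element code; it could equally be loaded from resources.
const char* const kAllowedHosts[] = {"vr.example.com", "demo.example.org"};

}  // namespace

// Returns true when the page's host is on the hard-coded list.
bool IsHostAllowedForExampleFeature(const std::string& host) {
  assert(!host.empty());  // Callers are assumed to pass a parsed host.
  for (const char* allowed : kAllowedHosts) {
    if (host == allowed)
      return true;
  }
  return false;
}
```

Nothing here crosses a process boundary or touches a settings object, which is the whole appeal: one file, one function, no plumbing.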

This explains why so many people fork Chromium and embed features. You can be extremely successful with this code if you avoid crossing the abstraction layers. To work in this fashion means that you have to be okay with patching your new code back in each time main moves forward, and it moves really, really fast.

In some cases, like this one, following the rules and trying to up-stream might actually be more costly. That is an assessment you would have to do. What would it take and how many abstractions do we have to deal with in order to plumb the values down, say from the Browser, all the way through to the element within the Renderer process?
  1. WebContents will represent your tab/page and is what you'll talk to to push your settings down.
  2. RenderFrameHost is hosted by WebContents and is the IPC bridge to the Renderer.
  3. RenderFrame is the remote end of the IPC channel and represents services provided to a page or document.
  4. WebFrame/WebLocalFrame are the wrappers around each frame (main page and iframe).
  5. ContentSettingsObserver/WebContentSettingsClient is the primary way to turn per page features on or off.
  6. LocalFrameClient is how you talk to your WebFrame to get access to settings.
Wow, 6 major levels of abstraction are present and each is quite different in its preferred mode of operation. It gets more complicated when you consider that some of the abstractions may have more than one implementation within the code.
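To give a feel for the shape of that plumbing, here is a toy model of a single bool travelling through the layers in the list above. Every type here is a made-up stand-in with a `Toy` prefix. None of them match the real Chromium interfaces, the IPC hop is reduced to a direct function call, and the WebFrame/WebLocalFrame layer is left out to keep it short.

```cpp
#include <cassert>

// Renderer side: holds the per-frame setting (stand-in for item 5).
struct ToyContentSettingsClient {
  bool allow_example = false;
  bool AllowExample() const { return allow_example; }
};

// Renderer side: remote end of the channel (stand-in for item 3).
struct ToyRenderFrame {
  ToyContentSettingsClient settings;
  void OnSetAllowExample(bool allow) { settings.allow_example = allow; }
};

// Browser side: the bridge (stand-in for item 2). A real bridge would
// serialize a message; the toy just calls the remote end directly.
struct ToyRenderFrameHost {
  ToyRenderFrame* remote;
  void SendSetAllowExample(bool allow) {
    assert(remote != nullptr);
    remote->OnSetAllowExample(allow);
  }
};

// Browser side: the tab (stand-in for item 1).
struct ToyWebContents {
  ToyRenderFrameHost* main_frame_host;
  void SetAllowExample(bool allow) {
    main_frame_host->SendSetAllowExample(allow);
  }
};

// Renderer side: what the element asks (stand-in for item 6).
struct ToyLocalFrameClient {
  const ToyContentSettingsClient* settings;
  bool AllowExample() const { return settings->AllowExample(); }
};

// The element that finally needs the flag.
struct ToyExampleElement {
  const ToyLocalFrameClient* client;
  bool DoSomething() const { return client->AllowExample(); }
};

// Wires the toy layers together, pushes the flag down from the "browser",
// and reports what the element sees.
bool PlumbAndQuery(bool allow) {
  ToyRenderFrame render_frame;
  ToyRenderFrameHost host{&render_frame};
  ToyWebContents contents{&host};
  ToyLocalFrameClient client{&render_frame.settings};
  ToyExampleElement element{&client};
  contents.SetAllowExample(allow);
  return element.DoSomething();
}
```

Even the toy needs six types to move one bool, and it ignores process boundaries, object lifetimes and the multiple implementations behind each layer.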

Browser to Renderer IPC is the first such item. When we look at the bridge that consists of items 1, 2 and 3 above, we are using an IPC::Sender and IPC::Listener. Looking more closely, RenderFrameImpl also supports Mojo marshaling. Uh oh, we should probably use Mojo, but IPC is so convenient. Just deciding WHAT we should use on this one point would take some time. And when it comes time to up-stream the code, the decision on which path to choose might get reversed. This would obviously slow down development quite a bit.

ContentSettingsObserver and WebContentSettingsClient are the next area with many competing solutions. While common settings like allowAutoplay, allowStorage and allowImage are all in this class, there are some things that route differently. Check out allowWebGL, which communicates directly to the RenderFrameImpl (using legacy IPC up to the Browser even). Our next decision is whether or not this setting is affected by the host, and whether we should use content setting rules as is the case with the implementation of allowAutoplay. Right now only a small fraction of settings go through that route, so maybe adding a new concept there would not be easy? Finally, do you need to deal just in the host or do you also need the URL of the resource itself (where the page may be the resource)? You can see how the implementation of allowImage is in this latter case.
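The difference between "just the host" and "host plus resource" can be sketched with a toy rule table. This is not how Chromium's content setting rules are implemented; `ToyRule`, `AllowForHost` and `AllowForResource` are hypothetical names, matching is exact-string or `*`, and the caller supplies the default value.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy rule: "*" matches any host. The first matching rule wins.
struct ToyRule {
  std::string primary_host;    // Host of the page.
  std::string secondary_host;  // Host of the resource being loaded.
  bool allow;
};

bool Matches(const std::string& pattern, const std::string& host) {
  return pattern == "*" || pattern == host;
}

// Host-only decision: only rules that do not care about the resource
// (secondary_host == "*") are considered.
bool AllowForHost(const std::vector<ToyRule>& rules,
                  const std::string& page_host,
                  bool default_value) {
  assert(!page_host.empty());
  for (const ToyRule& rule : rules) {
    if (rule.secondary_host == "*" && Matches(rule.primary_host, page_host))
      return rule.allow;
  }
  return default_value;
}

// Host-plus-resource decision: both the page and the resource must match.
bool AllowForResource(const std::vector<ToyRule>& rules,
                      const std::string& page_host,
                      const std::string& resource_host,
                      bool default_value) {
  assert(!page_host.empty() && !resource_host.empty());
  for (const ToyRule& rule : rules) {
    if (Matches(rule.primary_host, page_host) &&
        Matches(rule.secondary_host, resource_host))
      return rule.allow;
  }
  return default_value;
}
```

Choosing between these two shapes up front matters, because the second one forces every caller to know which resource it is asking about.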

This gift keeps on giving, as you'll notice almost all settings take a default value of some sort and this comes from another object, Settings, or even another object, RuntimeEnabledFeatures.

While writing my "production code" version of these changes I came across one last major hurdle. Even though the code I'm working on is m57 and fairly new, I found that main had already changed dramatically. In my case, entire classes had disappeared and I'm pretty sure Nasko had something to do with it. Things like FrameLoaderClient, which I was working in to plumb my value, were simply gone. This of course would have become another interesting problem when I tried to up-stream my changes, but it would also break anyone who was trying to maintain and apply patches to a forked version of the code.

After all of this I feel confident that I know 15 different ways (and am unaware of even more) to enable/disable features at run-time but I'm left with zero confidence that I know the right way. While I didn't write this article in a particularly helpful way (this is how you do this) I did write it in an informative way (here are all the actors in play). Hopefully you find it somewhat useful or maybe just entertaining. If I ever do figure it all out I'll be sure to let you all know.