Blog
2008 02 posts (3)
1. Data portability
2008-02-05 12:33:27 by Martynas Jusevičius
Recently, a number of initiatives have appeared, especially in the social networking area, that complain about Web applications being built as walled gardens which do not let users control their data or transfer it to another application: Data Portability, Open Social Web, OpenID. They demand ownership and control over profiles and relationships, and that these be published using open standards.
Although many Web 2.0 websites have started offering APIs, these do not solve the problem completely. Most of them are based on XML, which has the great advantage of being machine-readable but still leads to the N^2 problem: in order to be integrated, each pair of applications has to be programmed accordingly, with knowledge of the API and formats on the other end.
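To make the N^2 claim concrete, here is a back-of-the-envelope sketch in Python. The function names are made up for illustration, and it assumes that every application needs a dedicated importer for every other application's API, compared with one exporter plus one importer per application when everyone agrees on a shared format.

```python
def pairwise_importers(n: int) -> int:
    """Importers needed when each application must understand every other one's API."""
    return n * (n - 1)


def shared_format_adapters(n: int) -> int:
    """Adapters needed when each application only reads and writes one common format."""
    return 2 * n


for n in (3, 10, 50):
    # Pairwise integration grows quadratically, the shared format only linearly.
    print(n, pairwise_importers(n), shared_format_adapters(n))
```

Under these assumptions the two approaches cost the same for three applications, but the pairwise count pulls away quickly as more applications join.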
The true solution here might also become a bootstrap for the Semantic Web, which is not about some kind of Artificial Intelligence right now, but about data integration in the first place. In fact, the Web 2.0 data portability initiatives resemble the semantic Linking Open Data community project a lot. To support true data portability, Web applications should publish their data as RDF Linked Data, and ultimately provide SPARQL query endpoints and employ OWL ontologies.
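As a rough illustration of why a shared data model helps, the sketch below represents RDF-style triples as plain Python tuples. It is not a real RDF library: the two applications and the example.org URIs are hypothetical, and only the FOAF property URIs are real vocabulary terms. The point is that data from two sources can be merged with a set union, without code written for that particular pair.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
ALICE = "http://example.org/people/alice"  # hypothetical person URI

# (subject, predicate, object) triples as two hypothetical applications might publish them.
app_a = {
    (ALICE, FOAF + "name", "Alice"),
    (ALICE, FOAF + "knows", "http://example.org/people/bob"),
}
app_b = {
    (ALICE, FOAF + "name", "Alice"),
    (ALICE, FOAF + "knows", "http://example.org/people/carol"),
}

# Because both sides use the same URIs, integration is just a union of the two sets.
merged = app_a | app_b


def objects(graph, subject, predicate):
    """Return all objects for a given subject and predicate, sorted for stable output."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


friends = objects(merged, ALICE, FOAF + "knows")
print(friends)
```

A real deployment would use an RDF store and a SPARQL endpoint rather than in-memory sets, but the merging step stays conceptually this simple.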
2. RESTful cache
2008-02-15 13:08:41 by Martynas Jusevičius
We've recently been thinking about how to implement a cache over the DIY Framework. Ideally it should be an extra layer in the application and not require changes to the underlying design.
The golden rule of caching says that it is best to cache as close to the final product as possible. It is good to cache a result set, but best to cache the whole webpage. So we'll focus on that, since it is also easier to implement as a separate layer.
Imagine a product webpage. There are a couple of forms on it (e.g. for comments), and the page is only updated when they are submitted. Otherwise the content stays the same, so it can be cached and served until the page is updated again. The important thing is to know when the update happens and when to invalidate the cache.
Unfortunately (or not), there are web pages that are not updated directly via HTTP methods. They change because other pages get updated. Imagine a list of the most recent product comments. It would take some logic to figure out when it was updated, at the very least retrieving the timestamp of the most recent comment. This makes cache invalidation hard.
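One possible way to handle such a derived page is sketched below in Python. It is only an assumption about how this could work, not something the framework does: `latest_change` and `render` are hypothetical callbacks, the first returning the timestamp of the newest comment (a cheap query) and the second building the full page (the expensive part).

```python
from typing import Callable, Optional, Tuple


class DerivedPageCache:
    """Caches one derived page and rebuilds it when its source data is newer."""

    def __init__(self, latest_change: Callable[[], float], render: Callable[[], str]) -> None:
        self._latest_change = latest_change  # hypothetical: timestamp of the newest comment
        self._render = render                # hypothetical: builds the "recent comments" page
        self._entry: Optional[Tuple[float, str]] = None

    def get(self) -> str:
        stamp = self._latest_change()
        # Rebuild only if nothing is cached yet or the data changed since the last build.
        if self._entry is None or self._entry[0] < stamp:
            self._entry = (stamp, self._render())
        return self._entry[1]


comments = [(100.0, "first")]
renders = []


def latest() -> float:
    return max(t for t, _ in comments)


def render() -> str:
    renders.append(1)
    return ", ".join(text for _, text in comments)


cache = DerivedPageCache(latest, render)
a = cache.get()                      # rendered
b = cache.get()                      # served from cache
comments.append((200.0, "second"))
c = cache.get()                      # rendered again, a newer comment exists
```

The trade-off is that every request still costs one timestamp lookup, which is exactly the extra logic that makes these pages harder to cache than directly updated ones.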
The benefit of the REST architecture and our framework is that resources are fine-grained. If there is a resource with the URI Products/123 and it receives a POST (or basically any non-GET) request, we can assume it was updated. This would be harder to figure out in a non-RESTful design, for example if all product requests were handled by a single script and the URIs were something like products.php?id=123.
Making all forms submit to the same URI they are served from also seems to be good practice. If a product comment were submitted to some comment.php instead, the cache would not know that the product page was updated.
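The invalidation rule described here can be sketched as a thin layer in front of the application. This is a minimal Python illustration, not the DIY Framework's code: `CachingLayer` and the `app` callable are hypothetical names, and the cache is a plain in-memory dictionary.

```python
from typing import Callable, Dict, List, Tuple


class CachingLayer:
    """Serves cached pages for GET and drops the cached page on any other method."""

    def __init__(self, app: Callable[[str, str], str]) -> None:
        self._app = app                    # hypothetical application: (method, uri) -> page
        self._pages: Dict[str, str] = {}

    def handle(self, method: str, uri: str) -> str:
        if method.upper() != "GET":
            # POST, PUT, DELETE...: assume the resource at this URI was updated.
            self._pages.pop(uri, None)
            return self._app(method, uri)
        if uri not in self._pages:
            self._pages[uri] = self._app(method, uri)
        return self._pages[uri]


calls: List[Tuple[str, str]] = []


def app(method: str, uri: str) -> str:
    calls.append((method, uri))
    return f"{method} {uri} #{len(calls)}"


layer = CachingLayer(app)
first = layer.handle("GET", "Products/123")
second = layer.handle("GET", "Products/123")   # served from cache, app is not called
layer.handle("POST", "Products/123")           # invalidates the cached page
third = layer.handle("GET", "Products/123")    # rendered again
```

Keying the cache on the URI only works because each resource has its own URI; with a single products.php script the layer would have to inspect query parameters to know which page changed.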
Now we need to figure out how to implement this using some memory cache (such as eAccelerator) and sending correct Last-Modified and Cache-Control headers :)
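As a starting point for the header part, here is a hedged Python sketch of a conditional GET. The `respond` function is a hypothetical helper, it expects `last_modified` as a UTC-aware datetime, and the choice of `Cache-Control: no-cache` (clients may store the page but must revalidate it before reuse) is just one reasonable policy, not the only one.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def respond(last_modified: datetime, body: str,
            if_modified_since: Optional[str] = None) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body) for a GET on a cached page."""
    # HTTP dates have one-second resolution, so drop the microseconds.
    last_modified = last_modified.replace(microsecond=0)
    headers = {
        "Last-Modified": format_datetime(last_modified, usegmt=True),
        "Cache-Control": "no-cache",  # assumption: always revalidate with the server
    }
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
        except (TypeError, ValueError):
            since = None  # unparsable header: fall through to a full response
        if since is not None and last_modified <= since:
            return 304, headers, ""  # the client's copy is still current
    return 200, headers, body


modified = datetime(2008, 2, 15, 13, 8, 41, tzinfo=timezone.utc)
status, headers, page = respond(modified, "<html>product page</html>")
revalidated = respond(modified, "<html>product page</html>", headers["Last-Modified"])
```

When the client sends back the `Last-Modified` value as `If-Modified-Since`, the server can answer 304 with an empty body, so an unchanged page costs only a timestamp comparison.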
3. The Rule of customization
2008-02-23 13:51:57 by Martynas Jusevičius
Whenever I need to use complex GUIs that are supposed to hide the complexities of the code or generate it for you as a convenience, or complex configuration files, I get the feeling that something is not right, that they actually stand in the way of doing my work rather than helping me.
One example could be ASP.NET in Visual Studio and similar interfaces. A few clicks in a wizard allow you to bind the database and show a table of data on your page. You can of course add or remove columns, change styles and appearance, etc. But since you do not actually control the code behind it, if you are doing something more advanced you eventually hit a wall, because there is simply no way to do it using the interface. Then you still need to go down and hack the code. And in the meantime you have been learning all the knobs and buttons of a proprietary program instead of using and extending your knowledge of SQL or HTML.
Another example I can think of is huge declarative configuration files, usually written in XML. They are common in Web frameworks such as Struts, but probably elsewhere, too. They were probably built to make things simpler and just hold some constants, but then got out of hand and blew up. At some point it might seem that you are configuring more than coding, and still are not able to achieve what you want. And again, to do that you probably had to figure out a whole specification of a custom XML schema that you will not be able to use somewhere else, instead of sticking to your plain old Java code.
I am not saying customizable interfaces and configuration files are useless, but I think this rule applies:
At some point, customizable tools which are meant to ease software development become so complex that it takes more effort to figure them out and customize them for your needs than to build what you want from scratch.
