Author: Roman Rytov
Introduction
A friend of mine once showed me a game applet on a very popular game site. The applet listed the nicknames of the users logged in to a game room, and the list could be pretty long (a couple of hundred names). She asked whether I could write down the nicknames from the applet and, if so, do it automatically, of course. The task became more challenging when she mentioned that the site had hundreds of such rooms, though they all shared a very similar presentation structure. Besides that, every room had a corresponding HTML page showing roughly how many users were logged in.
Although nothing here amounted to hacking or damaging the system, a certain smack of adventurism (from a technical point of view, of course) allured me. I embarked on the challenge with bright enthusiasm. I immediately had several ideas for how to get to the list and started to speculate about each of them.
Prima facie - through the network
This way actually means sniffing the network. It requires putting a sniffer on the port the applet uses and decoding the incoming packets. Although this approach comes to mind before any other, it has nothing but flaws. Firstly, it is suitable only for an unsecured connection. Secondly, parsing the collected packets is not very reliable, since the parser has to be adapted after any change in the protocol (whether it’s a proprietary one or RMI). Thirdly, implementing the sniffer itself looks very laborious, since it must have some API to know when to start and stop capturing and how to filter the incoming streams.
After a deep sleep - through the JVM
It sounds a bit eccentric but is still practical and feasible. The approach is to assemble your own JVM (based on one of the existing ones) by substituting some core classes, for instance some classes from the java.io package. This would allow overriding certain functionality of an existing class, let’s say java.io.DataInputStream, by adding your own logic to it. The additional functionality should simply duplicate all input data to another place.
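The “duplicate all input data” idea can be sketched without rebuilding a JVM. The snippet below is only an illustration under my own assumptions: instead of patching java.io.DataInputStream itself, it wraps the underlying stream in a hypothetical TeeInputStream that copies every byte it delivers to a second stream. The class names TeeInputStream and TeeDemo are invented for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper (not from the original project): every byte that is
// read through this stream is also written to a second stream.
class TeeInputStream extends FilterInputStream {
    private final OutputStream copy;

    TeeInputStream(InputStream in, OutputStream copy) {
        super(in);
        this.copy = copy;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            copy.write(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            copy.write(buf, off, n);
        }
        return n;
    }
}

class TeeDemo {
    // Returns true when the side copy holds exactly the bytes that were consumed.
    static boolean copiedBytesMatch() {
        try {
            // Prepare some bytes the way a server might send them.
            ByteArrayOutputStream source = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(source);
            out.writeUTF("nickname");
            out.writeInt(42);
            out.flush();
            byte[] sent = source.toByteArray();

            // Read them back through the tee; the reader notices nothing.
            ByteArrayOutputStream copy = new ByteArrayOutputStream();
            DataInputStream in = new DataInputStream(
                    new TeeInputStream(new ByteArrayInputStream(sent), copy));
            String name = in.readUTF();
            int number = in.readInt();

            return name.equals("nickname") && number == 42
                    && Arrays.equals(sent, copy.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(copiedBytesMatch());
    }
}
```

A substituted core class would do the same copying from inside java.io, so the applet would not need to be touched at all; the wrapper only shows what the added logic amounts to.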
Although this way looks more practical than the first one, a number of obstacles arose. An applet may be run through a browser or through an appletviewer. Browsers do not allow choosing the JVM, which leaves us with the appletviewer. The appletviewer, however, has to be given the applet’s init parameters, and their structure varied a lot from room to room. That led me to decline this approach as well, though without marking it as a dead end.
Final plan – break through the applet itself
Reconnaissance in force
The idea is to change the applet’s code, taking advantage of the fact that the code is cached on the client side. Here I needed neither to change the JVM nor to worry about init parameters (all entry HTML pages to the rooms are cached too, along with cookies). Even moving to another computer is not a problem with such an approach. All that is needed is to replace the applet in the cache! Below I describe the whole algorithm in more detail.
The original applet was signed, and after installation it was cached on disk. I could then decompile, change and recompile the applet’s code as I wished, until a newer version of the applet replaced the cached code. The idea sounds simpler than it was in reality. The first trouble was obfuscation. Obfuscation is the process of polluting Java bytecode by replacing class, function and member names with meaningless strings, which makes the decompiled source completely unreadable. In my case the applet package contained about 200 classes, and they were not in a form ready for decompilation. I mean that I could decompile them, but the result would look like
package com.for.double;
public class 194 {
23 35x_ = new 73();
public int 87z (){}
...
}
As you may notice, the decompiled code more often than not consists of prohibited keywords, and its class names start with digits. In the snippet the bad names are the package parts for and double, which are keywords, and 194, 23, 35x_, 73 and 87z, which start with a digit. This makes it impossible to recompile the decompiled source back to bytecode.
My solution to this problem was pretty simple and clean. I temporarily edited all the binary code, changing bad names to legal ones. The workaround is to change the leading digit of a class name to a letter, i.e. classes 1* (10, 11, 12...100, 101 ...) become a* (a0, a1, a2...a00, a01) and 2* (24, 28,...201) go to b* (b4, b8,...b01). I applied the same transformation to all the other digits. It’s important not to forget to rename classes and packages accordingly. Bad package or function names I had to change manually, but fortunately there were only a few.
Now all the bytecode was ready for decompilation, and I decompiled it successfully. The newly decompiled code of the previous example would look like
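The renaming rule itself is easy to express in code. The following is a minimal sketch of the mapping only, not of the bytecode editing: the class NameMapper and its method names are hypothetical, and mapping a leading 0 to j is my own assumption, since the text only defines the rule for 1, 2 and “the other digits”. The reverse direction is included because the renamed classes eventually have to go back under their original names.

```java
// Hypothetical helper illustrating the digit-to-letter renaming rule.
final class NameMapper {

    // Leading digit -> letter: 1 -> a, 2 -> b, ... 9 -> i.
    // Assumption: 0 -> j (the original text does not say how 0 was handled).
    static String toLegal(String name) {
        if (name.isEmpty()) {
            return name;
        }
        char first = name.charAt(0);
        if (first < '0' || first > '9') {
            return name; // already a legal start, leave untouched
        }
        char letter = (first == '0') ? 'j' : (char) ('a' + (first - '1'));
        return letter + name.substring(1);
    }

    // Reverse mapping. Only safe for names that are known to have been
    // renamed by toLegal, because a genuine name may also start with a..j.
    static String toOriginal(String name) {
        if (name.isEmpty()) {
            return name;
        }
        char first = name.charAt(0);
        if (first < 'a' || first > 'j') {
            return name;
        }
        char digit = (first == 'j') ? '0' : (char) ('1' + (first - 'a'));
        return digit + name.substring(1);
    }

    public static void main(String[] args) {
        for (String n : new String[] {"194", "23", "35x_", "73", "87z"}) {
            System.out.println(n + " -> " + toLegal(n));
        }
    }
}
```

With this rule 194 becomes a94 and 87z becomes h7z, which matches the renamed names used in the example snippets.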
package com.fff.rouble;
public class a94 {
b3 c5x_ = new g3();
public int h7z (){}
...
}
Although the decompiled code still didn’t make much sense, I was now able to change the code and recompile the changed sources. The next step was to find the place where the gamers’ list was populated.
The search was in fact very simple and quick. The list must be filled from the network, so I located every point where one of the read functions of the java.io.DataInputStream class was called, and within 30 minutes I had found the exact place. What I needed then was to intercept the incoming information and, before it was written to the applet’s list, also make a copy for myself. I created another class in the applet’s package (com.fff.rouble.Writer) whose functions were called for every new row coming to the list. As is known, all classes used by an applet must be loaded through the same classloader instance, which obliged me to use the same package as the applet did. After compilation I applied the reverse transformation (a leading a became 1, b became 2 and so on; the package name returned to the original one) and replaced the changed classes in the applet’s code. With this task finished, the intrusion was basically complete, and the project moved to the next stage, namely polishing and optimizing.
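To give an idea of what such a helper class might look like, here is a rough sketch. Everything in it is an assumption made for illustration: the names RowWriter, record and onRowReceived are invented, the package declaration is left out so the example compiles on its own, and the rows go to a plain text file rather than to the real storage.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the extra class placed in the applet's package.
final class RowWriter {
    // Assumed output location, chosen only for this example.
    private static Path target = Paths.get("nicknames.txt");

    static synchronized void setTarget(Path path) {
        target = path;
    }

    // Appends one nickname as one line of the target file.
    static synchronized void record(String nickname) {
        try {
            Files.write(target, Collections.singletonList(nickname),
                    StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

final class RowWriterDemo {
    // Mimics the patched call site: copy the row first, then fill the list.
    static void onRowReceived(List<String> appletList, String nickname) {
        RowWriter.record(nickname);
        appletList.add(nickname);
    }

    // Feeds two rows through the call site and returns what was saved.
    static List<String> demo() {
        try {
            Path file = Files.createTempFile("rows", ".txt");
            RowWriter.setTarget(file);
            List<String> appletList = new ArrayList<>();
            onRowReceived(appletList, "alice");
            onRowReceived(appletList, "bob");
            List<String> saved = Files.readAllLines(file, StandardCharsets.UTF_8);
            Files.delete(file);
            return saved;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The only change needed in the decompiled applet code is the single extra call at the point where a row is read, which keeps the patch small and easy to reapply.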
Basing
After all the principal tasks were accomplished and tested manually, it was time to automate the whole process and to deal with performance and optimization. Remember that the goal was not just a matter of sporting interest but a complete, robust and efficient environment for pumping data. Remember also the number of different rooms and the number of potential rows in all the lists. Several directions had to be worked on, namely storing data, starting and ending applets, and choosing rooms.
An RDBMS was chosen as the persistent storage. I leave out a description of this part because there was nothing out of the ordinary to mention.
To run the applet I decided to use browsers. As I said, a browser (mine was IE 5.0) caches the applet code and the cookies from the site, hence all I needed to get to a room was a URL. I wrote a spider that captured all the entry-point pages to the rooms, and after parsing them the list of URLs was built. By automating the whole process I meant having a starter that navigates a browser to a certain URL, waits some time and then sends the browser to the next address. Such a starter was written. It turned out that IE is not designed to switch between applets easily and that its JVM is not cleaned up (the memory occupied by a browser grew heavily). Simply terminating browsers made the system get stuck (IE is apparently integrated too tightly with Windows). So my starter became a finisher as well, simply sending a WM_CLOSE message to the browser’s window.
The next weak point of the system was choosing when to send that message. Originally the starter switched URLs in a loop after a fixed period of time (about 30 seconds). It was tremendously inefficient: at best the applet completed its job in 10 seconds and then idly waited another 20 seconds to be navigated to another room; it was much worse if it got to an empty room; and worst of all was when 30 seconds was not enough to complete the pull. To resolve this problem I added another class to the applet that called a special native function after processing the list. This function, in turn, sent a system-wide unique message to the starter, letting it know that a WM_CLOSE message could be sent to the browser. I raised the timeout to 1 minute, and it came to play just the role of a fuse. This trick let me triple the performance!
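The “done signal plus fuse timeout” logic can be illustrated in isolation. This sketch is not the original starter: the real signal travelled from a native function to a separate process as a Windows message, whereas here an in-process CountDownLatch stands in for it, and the names RoomSession and StarterDemo are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical model of one visit to a room.
final class RoomSession {
    private final CountDownLatch done = new CountDownLatch(1);

    // Called by the collecting side once the whole list has been processed.
    void signalDone() {
        done.countDown();
    }

    // Waits for the signal, but never longer than the fuse timeout.
    // Returns true if the signal arrived, false if the fuse burned out.
    boolean awaitDone(long timeout, TimeUnit unit) throws InterruptedException {
        return done.await(timeout, unit);
    }
}

final class StarterDemo {
    // Element 0: a room that reports completion; element 1: a room that never does.
    static boolean[] demo() {
        try {
            RoomSession finished = new RoomSession();
            Thread collector = new Thread(finished::signalDone);
            collector.start();
            boolean signalled = finished.awaitDone(2, TimeUnit.SECONDS);
            collector.join();

            RoomSession stuck = new RoomSession();
            boolean fuseFired = !stuck.awaitDone(100, TimeUnit.MILLISECONDS);

            return new boolean[] {signalled, fuseFired};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("signalled=" + result[0] + ", fuseFired=" + result[1]);
    }
}
```

In either case the starter moves on to the next room as soon as awaitDone returns, so a fast room no longer costs the full waiting period and a slow one is cut off only by the fuse.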
Choosing rooms was a must, since the number of logged-in users varied significantly between rooms over time and I couldn’t afford visiting empty rooms. As I mentioned, there were snapshots of the room states (the HTML pages); by parsing them I could estimate the number of users in each room and skip the rooms where the number was too low.
There were some smaller tunings, but basically I have described all the parts of the system and all the key tricks that made it a pretty efficient program. My friend claimed that over 4 months it collected about 5 million users (what a great team must have worked on this game site to have such a massive community!).
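A simple version of this filter might look as follows. The actual markup of the room pages is not known, so the “Users: N” format, the room identifiers and the threshold are all assumptions made for the example, as are the names RoomFilter and RoomFilterDemo.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical filter that decides which rooms are worth opening.
final class RoomFilter {
    // Assumed page format: the user count appears as "Users: <number>".
    private static final Pattern USER_COUNT = Pattern.compile("Users:\\s*(\\d{1,6})");

    // Extracts the user count from a page, if the page contains one.
    static OptionalInt userCount(String html) {
        Matcher m = USER_COUNT.matcher(html);
        if (m.find()) {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        }
        return OptionalInt.empty();
    }

    // Keeps only the rooms whose page reports at least minUsers users.
    static List<String> roomsWorthVisiting(Map<String, String> pagesByRoom, int minUsers) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> entry : pagesByRoom.entrySet()) {
            OptionalInt count = userCount(entry.getValue());
            if (count.isPresent() && count.getAsInt() >= minUsers) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}

final class RoomFilterDemo {
    static List<String> demo() {
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("room-a", "<html><body>Users: 120</body></html>");
        pages.put("room-b", "<html><body>Users: 3</body></html>");
        pages.put("room-c", "<html><body>Users: 45</body></html>");
        pages.put("room-d", "<html><body>no counter here</body></html>");
        return RoomFilter.roomsWorthVisiting(pages, 20);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Rooms whose page has no recognizable counter are skipped as well, on the assumption that visiting a room of unknown size is not worth a browser start.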
Fortification
I’ll give some ideas on how I would protect my applet from such interference. But don’t forget that no program can be 100% protected. Moreover, in the Java world we can only talk about hampering outside penetration, not excluding it (since anyone may write their own JVM). Let me suggest an “antidote” for every case described above.
The last approach was getting in through the cached code. To protect against it, the server should not allow the applet to be cached (this can be achieved in a number of ways). But, as usual, it’s a double-edged sword: forcing users to wait for the applet to download every time they want to use it may alienate them.
The network-sniffing approach, as I said, may run into two problems: an SSL-based connection and a proprietary protocol. Each of them can be used to protect the system.
The JVM approach is the hardest to accomplish and just as hard to protect against. A stubborn enough programmer can replace some core JVM classes and control the whole life of your applet and hence all its data (even if you don’t allow the applet to be cached and use a client-side certificate). From my point of view, we can’t protect against this approach.
Conclusion
Having explained my practical solution and given some advice on how to protect against it, I have to mention that the point of defence here is a rather peculiar one. Most security systems protect a user’s data from “bad guys” but not from the users themselves. The point of my task was neither to hack the system nor to get secret data, but to write down automatically what a user legitimately sees. I don’t think that in practice anyone can deliver data to a user and at the same time keep the user from saving it. But, again, my task was very unusual, and during the project I found some interesting solutions that I share with you with pleasure.
Thursday, April 10, 2008
How to protect your applet from grabbing data it represents?