Wednesday, October 12, 2011

wifi enabling your xbox 360

my daughter received an xbox 360 from her friend for her birthday. i wanted to try out kinect, so i bought a used one from the local eb games and rented just dance 3 from gamefly. unfortunately it is a rather old 360, so i had to buy the ac adapter for the kinect. (lame! the 360 is a real power hog and now we have to plug something else in?!?) i hook everything up, power on the xbox, and up comes "you need to do a system upgrade".

so, i go to system settings to set up the wifi to do the upgrade and that is when i discover -- no built-in wifi! really!?! the wifi adapter is another $40, but all i really need it for is the upgrade. so i pull out my trusty thinkpad and set it up as a router instead. (wow, i should be getting paid for these product placements!)

it turns out to be really easy if you have a linux laptop with wifi and an ethernet port.

it helps to know a bit of background information:

private network addresses are addresses that are never routed on the public internet. they are designed to be used for local private networks, and you see them used in your home router. if you look at the address of your laptop right now, it will probably be 192.168.1.X. you can find a full list, and description, at http://en.wikipedia.org/wiki/Private_network.
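to make the ranges concrete, here's a quick illustrative check using the jdk's InetAddress class, whose isSiteLocalAddress method happens to match exactly the three rfc 1918 private ranges (the class name here, PrivateCheck, is just mine for the example):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class PrivateCheck {
    // true if the address falls in 10/8, 172.16/12, or 192.168/16
    static boolean isPrivate(String addr) {
        try {
            return InetAddress.getByName(addr).isSiteLocalAddress();
        } catch (UnknownHostException e) {
            return false; // not a valid literal address
        }
    }

    public static void main(String[] args) {
        System.out.println("192.168.1.5 private? " + isPrivate("192.168.1.5"));
        System.out.println("8.8.8.8     private? " + isPrivate("8.8.8.8"));
    }
}
```

note that getByName with a literal dotted quad does no dns lookup, so this is a pure local check.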

we are going to use nat, aka masquerading, to route traffic from the xbox through the laptop to the internet. when doing nat the laptop rewrites network traffic flowing through it so that the traffic looks like it is coming from the laptop, which is connected to the internet. (the funny thing is, your laptop is probably connected to a wireless router which is doing exactly the same thing.)
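as an illustration of the idea (this is a toy model, not how the kernel actually implements it), the masquerading box keeps a table mapping each rewritten connection back to the private host it came from. all the names and addresses below are made up for the example:

```java
import java.util.HashMap;
import java.util.Map;

class NatTable {
    // maps the public-side port back to the original "privateAddr:privatePort"
    final Map<Integer, String> table = new HashMap<>();
    final String publicAddr;   // the router's internet-facing address
    int nextPort = 40000;      // next free public-side port

    NatTable(String publicAddr) { this.publicAddr = publicAddr; }

    // outbound packet: rewrite the private source to our public address,
    // remembering who it really came from
    String outbound(String privateSrc) {
        int port = nextPort++;
        table.put(port, privateSrc);
        return publicAddr + ":" + port;
    }

    // inbound reply: look up which private host this reply belongs to
    String inbound(int publicPort) {
        return table.get(publicPort);
    }

    public static void main(String[] args) {
        NatTable nat = new NatTable("10.0.0.5"); // pretend wifi address
        String rewritten = nat.outbound("172.16.17.2:50000");
        System.out.println("172.16.17.2:50000 -> " + rewritten);
        System.out.println("reply to 40000 comes back to " + nat.inbound(40000));
    }
}
```

the remote server only ever sees the public address, which is why the xbox's private address never needs to be routable.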

when connecting network devices, such as a laptop and an xbox, using twisted pair ethernet (in other words, using those glorified phone jacks), you usually use two cables to connect each device to a network hub. a hub is too much of a pain if we are only connecting two devices. an alternative is to use a crossover cable that allows two devices to connect directly to each other; in a sense it makes each device look like a hub to the other device. fortunately, any reasonably new laptop will have an ethernet port that supports auto-mdix, which automatically detects if a crossover cable is needed and adjusts accordingly. this lets you get the crossover cable functionality using a normal ethernet cable.

finally, network devices use DNS to translate names into addresses, so we need the address of a DNS server to do this translation. fortunately, google provides an open DNS server for anyone to use. (right now you are probably using a DNS server provided to you by your internet service provider.) the address of the google DNS server is 8.8.8.8.

physically connecting xbox to the laptop

  1. connect the xbox and laptop together using a single ethernet cable.
okay that was simple...

setting up the networking on linux

we are going to use a private network to connect the two devices. our linux box, which will be the router, will have the address 172.16.17.1. our xbox will have the address 172.16.17.2. we also need to turn on routing on our linux box. finally, we need to tell the kernel to masquerade the packets coming from the xbox. we do all this with the following three simple commands. (make sure you are root when you do this!)
# ifconfig eth0 172.16.17.1 netmask 255.255.0.0
# echo 1 > /proc/sys/net/ipv4/ip_forward
# iptables -t nat -A POSTROUTING -o wlan0 -j MASQUERADE
that wasn't too hard either right?

setting up the networking on the xbox

go to the system settings menu and choose to configure the wired settings. by default it will do automatic configuration. we could run a dhcp server on linux to configure the xbox automatically, but that is more work. it's easy to set up manually:

  1. change the ip address to 172.16.17.2
  2. change the subnet to 255.255.0.0
  3. change the gateway to 172.16.17.1 (the laptop's ip address)
  4. change the DNS server to 8.8.8.8 (the google dns server)
you should end up with a settings screen showing exactly those four values.
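if you're wondering why the xbox accepts 172.16.17.1 as a gateway, it's because the two addresses land on the same subnet under that mask. a little sketch of the arithmetic (the class and method names here are mine, just for illustration):

```java
class SubnetCheck {
    // parse a dotted quad into a 32-bit int
    static int ip(String s) {
        String[] p = s.split("\\.");
        return (Integer.parseInt(p[0]) << 24) | (Integer.parseInt(p[1]) << 16)
             | (Integer.parseInt(p[2]) << 8)  |  Integer.parseInt(p[3]);
    }

    // two addresses are on the same subnet if they match under the mask
    static boolean sameSubnet(String a, String b, String mask) {
        return (ip(a) & ip(mask)) == (ip(b) & ip(mask));
    }

    public static void main(String[] args) {
        // the xbox (172.16.17.2) and the laptop (172.16.17.1) under 255.255.0.0
        System.out.println(sameSubnet("172.16.17.2", "172.16.17.1", "255.255.0.0"));
    }
}
```

with mask 255.255.0.0 only the first two octets have to match, which 172.16.x.x trivially does.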


and there you have it. you are good to go! the update came down and i was able to play just dance 3 on my updated xbox 360 with my kids.

Thursday, September 1, 2011

java profiling is confused by i/o


the other day a colleague came to me with output from a java profiler that showed that we were spending way too much cpu time doing i/o. (in our case the i/o was both network and disk i/o.) the application in question should be i/o bound but wasn't. we could see that by running top, so he was using the profiler to figure out where the cpu time was being spent.

modern operating systems are great about putting tasks waiting on i/o to sleep so that the cpu can work on other tasks or just idle until the i/o is finished. (in the dark ages your computer would hang during i/o. it was pretty amazing when microsoft released an operating system that would allow you to keep working while you formatted your disks!)

profilers work by taking periodic snapshots (or samples) of what a process is doing. after taking many such samples they can figure out where most of the time is spent in the code. an alternate way of profiling, much more accurate but with more overhead, puts probes in at each call entrance and exit and calculates the running time of each invocation of the call.
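to make the sampling idea concrete, here's a toy sampler built on the jdk's Thread.getAllStackTraces. it also demonstrates the pitfall this post is about: a thread blocked in sleep (or i/o) shows up in every wall-clock sample even though it is burning no cpu. (the class and method names are mine, and real profilers use the jvmti interface, not this.)

```java
import java.util.HashMap;
import java.util.Map;

class ToySampler {
    // take n samples ~intervalMs apart, counting the top stack frame
    // of every live thread each time
    static Map<String, Integer> sample(int n, long intervalMs) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < n; i++) {
            for (StackTraceElement[] stack : Thread.getAllStackTraces().values()) {
                if (stack.length > 0)
                    counts.merge(stack[0].toString(), 1, Integer::sum);
            }
            try { Thread.sleep(intervalMs); } catch (InterruptedException e) { break; }
        }
        return counts;
    }

    public static void main(String[] args) {
        // a thread blocked in sleep is counted in every sample,
        // just like the blocked read() in the hprof output below
        Thread sleeper = new Thread(() -> {
            try { Thread.sleep(60_000); } catch (InterruptedException e) {}
        }, "sleeper");
        sleeper.setDaemon(true);
        sleeper.start();
        sample(20, 5).forEach((frame, hits) -> System.out.println(hits + "\t" + frame));
    }
}
```

run it and the sleeper's Thread.sleep frame piles up samples despite doing nothing, which is exactly the distortion described next.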

my colleague knew all of this of course, which is why he was concerned that our i/o functions had too much overhead. i pointed out that past experience had taught me that the java profiler counts the time that a thread waits on i/o as running time. thus, you cannot trust the profiler's output when you are doing a bunch of i/o.

ever the skeptic, my colleague pointed out that that would be stupid. the profiler can see that a thread is waiting for i/o and thus should not count the waiting time, therefore i had to be wrong.

to prove my point i wrote the following short program:

import java.net.*;
import java.util.*;

class t {
  public static void main(String a[]) throws Exception {
    new Thread() {
      public void run() {
        while (true) {
          try {
            String s = "";
            for (int i = 0; i < 1000; i++) { s = s + i; }
            Thread.sleep(1);
          } catch (Exception e) {}
        }
      }
    }.start();
    System.in.read();
  }
}

all this program does is startup a thread that burns cpu by building a string and then throwing it away.
the main thread, in the meanwhile, is just sitting there waiting for input from stdin. obviously, one thread is cpu bound while the other is i/o bound. so let's see what the java profiler tells us after we let our test program run for a while under the profiler and then kill it:

breed@laflaca:~$ java -agentlib:hprof=cpu=samples t
^CDumping CPU usage by sampling running threads ... done.
breed@laflaca:~$ cat java.hprof.txt
JAVA PROFILE 1.0.1, created Wed Aug 31 11:45:20 2011
Header for -agentlib:hprof (or -Xrunhprof) ASCII Output (JDK 5.0 JVMTI based)
...
CPU SAMPLES BEGIN (total = 966) Wed Aug 31 11:48:21 2011
rank   self  accum   count trace method
   1 54.76% 54.76%     529 300025 java.io.FileInputStream.readBytes
   2 17.60% 72.36%     170 300040 java.lang.Integer.getChars
   3 11.08% 83.44%     107 300039 t$1.run

according to this more than half of the time is spent in the read call! of course virtually no cpu time is spent there since it is waiting on i/o. ok, maybe it is a problem with sampling. let's try it with probes:

breed@laflaca:~$ java -agentlib:hprof=cpu=times t
^CDumping CPU usage by timing methods ... done.
breed@laflaca:~$ cat java.hprof.txt
...
CPU TIME (ms) BEGIN (total = 4946) Wed Aug 31 11:50:46 2011
rank   self  accum   count trace method
   1 12.94% 12.94%   23859 301024 java.lang.AbstractStringBuilder.append
   2 12.68% 25.62%   23858 301028 java.lang.AbstractStringBuilder.append
   3  6.61% 32.23%   23858 301031 java.lang.String.<init>

much better! the top three calls are all related to string processing, which we would expect. actually string processing makes up the top 20 calls. awesome, so perhaps we just need to use times... but wait. the read call never returned. what happens if we give our program some input (by pressing enter) before we kill it?

breed@laflaca:~$ java -agentlib:hprof=cpu=times t
^CDumping CPU usage by timing methods ... done.
breed@laflaca:~$ cat java.hprof.txt
...
CPU TIME (ms) BEGIN (total = 8431) Wed Aug 31 11:54:38 2011
rank   self  accum   count trace method
   1 49.91% 49.91%       1 301067 java.io.FileInputStream.read
   2  6.59% 56.51%   19795 301028 java.lang.AbstractStringBuilder.append
   3  6.35% 62.85%   19796 301024 java.lang.AbstractStringBuilder.append

yep, there's the read again!

so, what's the moral of this story? be careful when using java profilers. they can be useful tools, but when dealing with processes that do lots of i/o you may need to skip over some of the results to get to the real performance problems.

Wednesday, July 6, 2011

a network layer proxy

i've been wanting to get this going for a long time. i'm often on untrusted networks, and it would be nice to use a secure tunnel to get onto the internet. luckily there is this cool program called redsocks that will interface with iptables to transparently use a socks proxy. this page was an excellent guide to setting things up.

first, we set up a chain to hold our routing policy. we will put all the policy in a chain called REDSOCKS:
sudo iptables -t nat -N REDSOCKS
sudo iptables -t nat -A REDSOCKS -d 10.0.0.0/8 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 127.0.0.0/8 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 169.254.0.0/16 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 172.16.0.0/12 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 192.168.0.0/16 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 224.0.0.0/4 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 240.0.0.0/4 -j RETURN
sudo iptables -t nat -A REDSOCKS -p tcp -j REDIRECT --to-ports 12345

the bulk of the rules (the ones with RETURN) make sure we don't change the routing of connections to private addresses. the last rule is key; it will cause all other connections to get routed through redsocks. specifically, it will cause the kernel to forward connections to port 12345, which redsocks will be listening on. redsocks then uses SOCKS to route the connection through ssh and out to the internet. (for the curious, as i was: redsocks figures out the original target of the connection using a getsockopt call.)

note that in this case the server we are sshing to is 127.0.0.1, which we cannot route through redsocks. (otherwise we have a chicken and egg situation.) if we are sshing to a routable address, we need to add an entry to iptables to exempt it. if X.X.X.X is the machine that we are sshing to, the table setup would be as follows:

sudo iptables -t nat -N REDSOCKS
sudo iptables -t nat -A REDSOCKS -d X.X.X.X/32 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 10.0.0.0/8 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 127.0.0.0/8 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 169.254.0.0/16 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 172.16.0.0/12 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 192.168.0.0/16 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 224.0.0.0/4 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 240.0.0.0/4 -j RETURN
sudo iptables -t nat -A REDSOCKS -p tcp -j REDIRECT --to-ports 12345
one other thing to note is that the guide above says to use the DNAT target rather than REDIRECT. for this part of the project both worked, but i couldn't get DNAT working for connections routed through my laptop from the router. we will get to how to set up this routing in the next post.

ok, now the REDSOCKS chain is set up, but nothing will happen since it is just a chain floating out there in iptables. we've got to connect it to a chain that is actively involved in packet processing. the chain we are looking for is the OUTPUT chain. this chain is used in the initial stages of local packet routing.
sudo iptables -t nat -A OUTPUT -p tcp -j REDSOCKS
the iptables documentation has an excellent explanation of the iptables chains and how/why they are used.

with that last rule the kernel will start routing connections to port 12345, so we need to start something listening on that port. that something is redsocks. you'll need to download it and compile it.

before we start up redsocks, we need to point it at a configuration file.

base {
        log_info = on;
        log = stderr;
        daemon = off;
        redirector = iptables;
}
redsocks {
        local_ip = 127.0.0.1;
        local_port = 12345;
        // `ip' and `port' are IP and tcp-port of proxy-server
        ip = 127.0.0.1;
        port = 1080;
        type = socks5;
}

nothing really tricky here. local_port is the port that the kernel will use to forward connections to redsocks. ip and port are the ip address and port of the socks server. once you have created the config file (i called it redsocks.conf), start redsocks with:
redsocks -c ./redsocks.conf
of course make sure your socks server is up. from the previous post i was forwarding an ssh port to local port 2222, so i started up socks with:
ssh -p 2222 -D 1080 127.0.0.1
the -D 1080 flag tells ssh to service SOCKS clients on port 1080. so there you go: transparent tunneling over the internet. pretty cool huh? of course we have to do a bit more if we want to route connections from other devices through this machine.

Tuesday, July 5, 2011

fun with ssh



for the 4th of july weekend we headed to los cabos, mexico for a family reunion. we stay at this really nice resort, barcelo, and they have these really cool plasmas on the wall. we start playing with them: they have games and a web browser and come with a keyboard. you can facebook on it! pretty cool.

i notice that it's running google chrome, and then my son finds gnu chess. it's definitely running linux, so 30 seconds after turning on the tv we have a shell. (you just have to push alt-f2 and start up bash :)


now, what can you do with a shell? more specifically, since we don't want to mess up anything or use exploits, what can you do from user space?

well, we don't actually know the guest (huésped in spanish) password, so we are further limited. but there are a couple of things i noticed:
  1. the wifi network, which you have to pay $20 a day for, is accessible from the wired network (ethernet) that the room is hooked up to. the wifi network is one of those that gives you an ip address but doesn't route you to the internet until you pay.
  2. there is some sketchy proxying going on. i'm sure they aren't doing anything malicious, but i'm a bit paranoid about these things.
  3. it has ssh.
  4. we brought our wifi router from home. (ok, i kind of knew that before we walked into the room :)
one other thing to consider is that we have a bunch of wifi devices: 3 itouches, 2 computers, and 2 phones. so, i do the project i've been wanting to do for a long time: transparent socks proxying at the network layer. because there are other devices involved i'm throwing in ip masquerading on top.

if i wasn't worried about observation number 2, i could have just started up a simple socks proxy on the tv. instead i use a remote vps (basically a web service shell account) i have from jaguarpc.

getting internet access


before i start, let me just point out this was a research project. i actually got legitimate wifi access after i got this working. you should too. $20 is kind of a rip off, but not a total one. (it does seem to go against the notion of an "all inclusive" resort though...)

first, i set up my laptop to connect to the wifi network. whenever i try to connect anywhere i get the prompt for the login or payment for internet service. we'll fix that in just a second.

the easiest option to get internet access would be to ssh from the tv to my external ssh account using:
ssh -g -D 1080 myserver.me
but that would open up access through my server to everyone. of course, i'm just being paranoid since why would anyone look for an open socks proxy on the tv in our room?

since I am paranoid i do something better:
ssh -R 2222:myserver.me:22 laptop-address
laptop-address is the address of the wifi interface of my laptop. once that tunnel is up, on my laptop i can use
ssh -D 1080 -p 2222 127.0.0.1
to connect to myserver.me through the tv. since i'm using the -D option, i also have a fully functional local socks proxy going. so, i change the proxy settings of the web browser on my laptop to 127.0.0.1:1080 and voila (or ándale since we are in mexico) i'm on the internet from my laptop.

ok, so this is a start. but it's a pain to set up socks settings everywhere, and i want to allow the rest of my devices to connect through, so of course i didn't stop there.

Monday, June 13, 2011

pedantic coding

(hopefully this is obvious, but these opinions are my own and should not be construed to represent those of my employer or the zookeeper project. both sometimes, erroneously :), disagree with me.)

lately an open source project i'm involved with, zookeeper, was the subject of a rant about clean code.

i'm a big believer in clean code and best practices. some of those best practices also include things like "if it ain't broke, don't fix it".

one reason to write clean code is to avoid bugs, but another important reason is to make the code maintainable. if best practices are followed, the next developer who subscribes to these same practices can jump into the code quickly.

as i've worked on different code bases, i have found some really nice reasons to break best practices. for example, in the linux code base there is this common pattern:
void some_function() {
        ... initial section ..
        if (error) goto out1;
        ... more setup ...
        if (error) goto out2;
        ... finish the work ...
out2:
        ... cleanup from more setup ...
out1:
        ... cleanup from initial section ...
}  
the above code breaks the cardinal rule of C: "goto considered harmful". so harmful, in fact, that some programmers don't even know that C has a goto. i do not use gotos myself; however, that pattern is extremely useful in the kernel, where setup and teardown of structures can be complicated and can cause big problems if done incorrectly. i wouldn't call such code unclean.

other code bases have similar issues and still subscribe to the "no goto" rule. one alternate way to achieve the above is:
void some_function() {
        do {
                ... initial section ..
                if (error) break;
                do {
                        ... more setup ...
                        if (error) break;
                        ... finish the work ...
                } while(0);
                ... cleanup from more setup ...
        } while(0);
        ... cleanup from initial section ...
}
this form has many of the same benefits as the earlier code and doesn't use goto. the while(0) construct does look a bit strange when you first see it though. it's also not as easy to see where the break is going to take you, and the indentation gets quite deep. personally i like the gotos better.
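for comparison, java usually expresses the same setup/teardown shape with nested try/finally, which gives you the goto pattern's guarantees without the goto. a small sketch (the class and phase names here are mine, logged so you can watch the cleanup order):

```java
import java.util.ArrayList;
import java.util.List;

class CleanupDemo {
    // the goto / do-while(0) pattern in java: each finally block plays
    // the role of one cleanup label, and an early return still runs
    // every finally that encloses it
    static List<String> run(boolean failInSetup) {
        List<String> log = new ArrayList<>();
        log.add("initial section");
        try {
            if (failInSetup) return log; // like "goto out1"
            log.add("more setup");
            try {
                log.add("finish the work");
            } finally {
                log.add("cleanup from more setup");
            }
        } finally {
            log.add("cleanup from initial section");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

on the failure path only the outer cleanup runs, exactly matching the "goto out1" branch in the C version.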

it can get a bit out of control though. i just finished teaching a graduate OS class on linux, and there is one particularly hairy piece of code central to the cache manager that really shows why gotos are scary! check out do_generic_file_read in the linux kernel. it's pretty out of control, but it is also doing something pretty complex. i'm not sure that making the code clean by removing all the gotos and refactoring into smaller functions would make it better. it may, but what we have works well, and once you read through it you realize that it isn't that hard to follow.

so where am i going with all of this? i think we need to differentiate between clean code and pedantic code. i propose that clean code is something that follows effective programming rules with an eye toward readability and maintainability, while pedantic code strictly follows effective programming rules without regard to readability and maintainability. clean code has a degree of subjectivity and may occasionally violate a rule, while pedantic code can be objectively and precisely measured.

some of this is summed up nicely in the introductory chapter of the book Clean Code:
We are willing to claim that if you follow these teachings, you will enjoy the benefits that we have enjoyed, and you will learn to write code that is clean and professional. But don’t make the mistake of thinking that we are somehow “right” in any absolute sense. There are other schools and other masters that have just as much claim to professionalism as we. It would behoove you to learn from them as well.
there is also the issue of working code vs new code. there is always a trade off between making minimal changes to code to fix a bug, improve performance, or add some other improvement, and cleaning up code. especially with a mature code base, minimal changes minimize the chance of introducing new bugs. (fixing a bug and introducing a worse new bug with the fix is terribly tragic!) cleaning up code can really help long-term maintenance, if done correctly, but it must be weighed against the big short-term risk of code instability and bugs. it is great to occasionally bite the bullet and rip out the old and put in the shiny new, but you need to do it when the old has quiesced a bit and you can have a shiny new release that people will test and use with caution.

i would like to conclude with these ideas: clean is not an objective universal metric, there are different views of clean; it helps expose bugs, but it doesn't make them go away; code maturity needs to be balanced against code cleaning; and, as i mentioned earlier, cleanliness is something to strive for, but don't be pedantic about it. (yes, i purposely ended that clause with a preposition :) i don't think you should declare something not production worthy based solely on your definition of cleanliness. i would love to see a cleaner, according to my definition of clean, version of do_generic_file_read, but i don't hesitate to rely on it every day for my livelihood.