Thursday, March 21, 2019

pagetable adventures in linux x86_64

i'm teaching CS 149 (Operating Systems) this semester. it is one of my favorite classes! we are currently covering virtual memory and page tables. i'm using the operating systems: three easy pieces book which covers the topic well; however, after my lecture on it, i felt that the students needed a way to see the page tables in action; i wanted to let them look at the page tables of a process in real time. it would allow them to see the data structures involved, walk through the resolution, and get the final mapping. turns out, doing it was a bit harder than i anticipated.

accessing the top level page table

we are using x86_64 linux which has a 4 level page table. philipp oppermann has an amazing explanation of x86_64 page tables. i highly recommend checking it out!

the first step to accessing the top level page table is reading the CR3 CPU register. unfortunately, reading CR3 is a privileged operation. fortunately, allan cruse from the university of san francisco wrote a kernel module for exposing CR3 through /proc/cr3. it needed a bit of adapting to make it work with x86_64 and the new /proc interface, but i got it implemented: https://github.com/breed/virt2phys/blob/master/kernel-module/cr3.c.

with the information from CR3 we can get the address in physical memory of the top page table for the current process. the top page table is 4K in size and contains 512 entries of addresses to the next level page tables. our next task is reading these tables from physical memory.

give me a physical page!


in the good old days, there was this intriguing file in /dev called /dev/mem. when i first started using linux and before i completely understood virtual memory, that file remained a half understood mystery. i did learn that you could sometimes recover emails and editing sessions that you prematurely canceled by grepping it, but i never actually had a need to use it in a program.

it turns out /dev/mem isn't mysterious at all! it allows you to access physical memory as if it was a file. (technically it is a character device, but UNIX allows character devices to be interacted with as if they were files :) ) you simply open() /dev/mem, lseek() to the offset in the physical memory that you want to access, and then access the physical memory with read() or write().
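the open/lseek/read sequence is short enough to sketch. this is a minimal python version, not the code from the kernel module or pagetables.c; the function name read_physical_page and the path parameter are made up for illustration, and it only works on a kernel where /dev/mem is actually readable (and you are root).

```python
import os

PAGE_SIZE = 4096  # x86_64 base page size


def read_physical_page(phys_addr, path="/dev/mem"):
    """Return the 4K page containing phys_addr, read from a memory-like file."""
    # round the address down to the start of its page
    page_start = phys_addr & ~(PAGE_SIZE - 1)
    fd = os.open(path, os.O_RDONLY)
    try:
        # for /dev/mem the file offset is the physical address
        os.lseek(fd, page_start, os.SEEK_SET)
        return os.read(fd, PAGE_SIZE)
    finally:
        os.close(fd)
```

passing a different path makes it easy to try the function against an ordinary file first.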

/dev/mem is perfect for what we need to do! tragically, /dev/mem has effectively been disabled in recent kernels. it is a huge security hole! you can recompile the kernel to enable it, but i didn't want to require students to do that to examine page tables.

so i went the more difficult route of expanding the kernel module to also expose /proc/page_reader. this file allows you to lseek() to the physical page you want to read and then read its contents.

putting it all together


now that we have access to CR3 and physical pages, we can chop up the virtual address into its 5 components: the 4 9-bit indexes into the 4 levels of page tables and the 12-bit offset into the 4K page.
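the chopping is just shifts and masks. here is a small python sketch of it (the function name split_virtual_address is mine, not something from pagetables.c), using the address from the example run in this post:

```python
def split_virtual_address(vaddr):
    """Split a 48-bit x86_64 virtual address into 4 table indexes and a page offset."""
    offset = vaddr & 0xFFF  # low 12 bits: offset into the 4K page
    # four 9-bit indexes, top level table first
    indexes = [(vaddr >> shift) & 0x1FF for shift in (39, 30, 21, 12)]
    return indexes, offset


indexes, offset = split_virtual_address(0x15010D00D000)
print(" ".join(f"{i:03X}" for i in indexes), f"{offset:03X}")
# prints: 02A 004 068 00D 000
```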

here is an example run of the pagetable program in https://github.com/breed/virt2phys/blob/master/pagetables.c. (we run it with sudo and we insmod the cr3.ko module before we run it.)

CR3 is 6F1D0006
data(): addr 000015010D00D000 -> 02A 004 068 00D 000
Need to resolved entry 02A in 000000006F1D0000
PAGE TABLE for 000000006F1D0000 (non zero entries):
  02A 8000000078173067
  0AB 800000007915a067
  0FE 800000006f228067
  0FF 800000006f7b0067
  136 0000000075f60067
  170 000000007d144067
  1B0 00000000702e2067
  1F6 000000007f73d067
  1FC 000000007ff3a067
  1FE 0000000075d1c067
  1FF 000000007580e067
Got PTE 8000000078173067
Need to resolved entry 004 in 0000000078173000
PAGE TABLE for 0000000078173000 (non zero entries):
  004 000000006f6ff067
Got PTE 000000006F6FF067
Need to resolved entry 068 in 000000006F6FF000
PAGE TABLE for 000000006F6FF000 (non zero entries):
  068 0000000076762067
Got PTE 0000000076762067
Need to resolved entry 00D in 0000000076762000
PAGE TABLE for 0000000076762000 (non zero entries):
  00D 80000000479ec867
Got PTE 80000000479EC867
data(): virt 000015010D00D000 -> phys 00000000479EC000
------------------
here we see the address 0x15010d00d000 breaks up into 4 9-bit indexes 0x2a, 0x4, 0x68, and 0xd. CR3 is pointing at 6f1d0000 (the low 12-bits are used for flags), so our top level page table is stored in the physical address 6f1d0000. we grab the 4K of data stored at 6f1d0000. now we need to find the 0x2ath (0 based) page table entry in that 4K of data. each page table entry is a 64-bit integer, so we can cast the page table data to an int64_t *pte and then look at pte[0x2a] which is 8000000078173067.

the top bit of 8000000078173067 (the leading 8) is the NX bit; it means that we are mapping memory that does not contain executable code. (there is that security again!) the page table entry's bottom 12 bits are flags, so we need to mask those off, along with the NX bit at the top, to get the physical address of the 2nd level page table, which is 78173000. we are going to do this page retrieval and indexed lookup three more times until we finally get the physical address of the page that holds the data. we then use the 12-bit offset, which is 0 in this case, to find the exact bytes that we are looking for within that page.
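to make the walk concrete, here is a python sketch that replays it. this is not the code in pagetables.c: instead of reading physical memory it uses a dict (TABLES, a name i made up) holding just the entries from the run above that the walk touches. it also assumes plain 4K pages and ignores huge pages.

```python
ADDR_MASK = 0x000FFFFFFFFFF000  # bits 12-51 hold the physical page address
PRESENT = 1                     # bit 0: the entry is valid

# entries copied from the example run, keyed by the table's physical address
TABLES = {
    0x6F1D0000: {0x02A: 0x8000000078173067},
    0x78173000: {0x004: 0x000000006F6FF067},
    0x6F6FF000: {0x068: 0x0000000076762067},
    0x76762000: {0x00D: 0x80000000479EC867},
}


def translate(cr3, vaddr, tables=TABLES):
    """Walk the 4 levels of page tables and return the physical address."""
    table = cr3 & ADDR_MASK  # drop the low flag bits of CR3
    for shift in (39, 30, 21, 12):
        index = (vaddr >> shift) & 0x1FF
        pte = tables[table][index]
        if not pte & PRESENT:
            raise LookupError("page table entry not present")
        # masking removes both the low flag bits and the NX bit
        table = pte & ADDR_MASK
    return table | (vaddr & 0xFFF)


print(f"{translate(0x6F1D0006, 0x15010D00D000):016X}")
# prints: 00000000479EC000
```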

conclusion

virtual memory and page table resolution is a fascinating bit of black magic that makes our life as a programmer pretty awesome! peeking behind the curtains can help you understand what is really happening when you run your code. in the next post i'll delve deep into the real magic involving COWs and demand paging.

Monday, January 7, 2019

Samsung S9 ruined by Samsung customer care

tl;dr I got my S9. Loved it. It broke. Samsung couldn't fix it. Left me without a phone for over 2 months before offering to exchange it. Hopeful ending.

The S9 is a marvel


I was leaving Facebook and for the first time in a long time I needed to buy my own phone. I narrowed the choice down to a Pixel 3 or Samsung S9. I had owned the original Pixel and really liked it. I had previously owned an S6 Edge, and while the curved edge looked cool, I found the experience with the curved edge to be subpar.

In the end the choice was pretty easy:

  1. I wanted Samsung Pay. The ability to work with swipe readers is pretty cool!
  2. The S9 had a bigger battery.
  3. It has a microSD card slot!
  4. I respected Samsung as a company. I had worked with their QA and I knew that they made sure their products were going to work well.
On July 12, 2018 I received my new S9. It was a beautiful phone. I loved the design of the phone and the software. (I did turn off Bixby...) I always ended up holding it gently in my palm like a beautiful piece of art.

I am a klutz


Three days later my phone was cracked. Not to point fingers, but my gentle holding combined with vigorous gesturing by my amazing wife resulted in the phone flying through the air and hitting the ground. It was the first and last time it hit the ground.

The phone continued to work fine apart from the ugly crack. We were traveling, so there wasn't anything to do until we got home a week later.

ubreakifix to the rescue


Fortunately, my Costco Citicard had buyer protection, and after traveling I was able to get the screen fixed at the Samsung authorized shop, ubreakifix. They did a great job and the phone was as good as new.

Something was not right


I used the phone a lot! I bought a case to prevent further injury since my phone is such an essential part of my life. We don't have a home phone, and my work number is forwarded to my cell phone.

Occasionally I would notice that my phone had rebooted. I thought perhaps it was an automatic software update, but soon it started happening multiple times a day. Finally, it became unusable: every time the phone went to sleep it would power itself off.

Early experience with Samsung customer care

I contacted customer care near the end of October. They suggested that I do more and more destructive data resets of the phone. Clearly they believed that an app was doing it. Sadly, I agree it is possible on Android, although even if the OS doesn't prevent an app from taking down a phone, it should at least detect it. I don't know if Samsung's tweak of Android can do such detection, but customer care was clearly shooting in the dark.

I found that doing a factory reset would make the phone work long enough to believe that the problem was resolved, but after a day of use the problem would come back. So, on October 25th I sent it in. I had to mail it to the service repair center in Texas, which was a huge pain since this is my only phone. I asked for a loaner or exchange or something, but they said I didn't qualify. So off it went. I was sad, but hoping to get a working phone back.

The first repair


I was informed on October 30th that they started working on the phone. On the 31st I was informed that my repaired phone was coming back to me. The root cause: "No problem found". I knew that was impossible. I guessed (correctly, it turns out) that since I had to do a factory reset on the phone before I sent it in, they ran through standard diagnostics, and when everything passed, they sent the phone back.

I went online and begged them not to send the phone back and instead look deeper. Here are some key excerpts from the chat session (User is me):

AGENT_NAME (Samsung Agent)(10-31-2018 01:37:04 AM)
I see that your phone repair has been completed and an UPS label is created. The scheduled delivery date will be updated once it is picked up by the UPS team.
AGENT_NAME (Samsung Agent)(10-31-2018 01:37:14 AM)
Here is the return tracking number: UPSTRACKING.
User (10-31-2018 01:37:40 AM)
please DO NOT SEND THE PHONE BACK!!!! IT DOES NOT WORK! giving me a USB cable will not fix the problem.
User (10-31-2018 01:38:16 AM)
if it is too late to stop the shipment can you please send another box for me to send the phone back to you? i cant use a phone that is continually powering itself off.
AGENT_NAME (Samsung Agent)(10-31-2018 01:38:59 AM)
I see that your phone software is updated and passed all the functional testing at our service center.
AGENT_NAME (Samsung Agent)(10-31-2018 01:39:14 AM)
Rest assured, the device would meet your expectations once you received it.
User (10-31-2018 01:40:09 AM)
the phone software was already uptodate. if it still powers off, what do i do?
AGENT_NAME (Samsung Agent)(10-31-2018 01:40:15 AM)
Our technicians has examined and performed functional testing for the long period of time and it is certified to be fully functional by our experts and it will work as good as a new one.
User (10-31-2018 01:40:21 AM)
according to the tech it is marked as no problem found.
User (10-31-2018 01:41:16 AM)
so if the phone is still experiencing the problem, what do i do?
AGENT_NAME (Samsung Agent)(10-31-2018 01:41:25 AM)
As per the ticket status, they did every testing on your phone and updated the phone software again to make sure it is working fine.

...

 

User (10-31-2018 01:44:07 AM)
i guess there is no other option that wait to escalate. so when it starts powering off again, should i take a video and then get back on this chat with the ticket number?
AGENT_NAME (Samsung Agent)(10-31-2018 01:44:35 AM)
I understand how crucial to have the phone with you. I can assure you that the device you would receive is fully functional and you'd be able to use your device as you were able to before.
AGENT_NAME (Samsung Agent)(10-31-2018 01:45:30 AM)
Yes, you wouldn't have to start over if you come back to us with the chat id again.
User (10-31-2018 01:45:30 AM)
i'm sure i will be. it's just the continual powering off is really impossible. i miss so many calls and alarms.
AGENT_NAME (Samsung Agent)(10-31-2018 01:46:05 AM)
You will not experience any powering off issues again.
AGENT_NAME (Samsung Agent)(10-31-2018 01:46:15 AM)
Rest assured!

...

 

Clearly the agent had full faith in the repair center. The "Rest assured!" phrase has come to echo in my mind as I continued my descent into the hell that is Samsung customer service over the next couple of weeks.

Escalation


Needless to say the problem wasn't fixed. I held out some hope, but by the 2nd day after I got the phone back it was again powering itself off. I contacted customer support again and sure enough I got "escalated". That sounds like a good thing, but what actually happens is that I needed to call back multiple times (they need 2-3 business days to evaluate your request) and wait on the phone a LONG time just to find out that I need to send the phone back into repair. By now my phone hasn't worked for almost 3 weeks! Fortunately, when I was told to send the phone back again the service agent happened to mention that I could have walked the phone into ubreakifix to get it repaired. (Why wasn't I told that in the first place?!? I had explained how much I needed the phone for my day-to-day work.)

ubreakifix again


I made another trip to ubreakifix. The phone would power itself off all the time, so it was easy for them to see what was going on. They said that there must be something fundamentally wrong with the phone, but they thought that perhaps the battery replacement that was done when the screen was replaced might be causing the problem. (Evidently Samsung wants them to replace the battery every time the screen is changed, even if the phone is just a couple of weeks old...) They replaced the battery, and it looked like the problem was fixed. They kept it a couple of days just to make sure, and tragically, the problem started happening for them again. They told me that I would have to send it to Samsung repair.

Second trip to Samsung repair


I was skeptical that Samsung repair was going to do anything if I sent it in again, so I spent a few hours on a weekend to get a repro of the problem. I figured out that just setting up Samsung Pay with fingerprints would cause the problem to happen. I called customer service so that I could make sure that repair saw the problem instead of resetting the phone. I even made a video:


They told me the repair center would reset the phone because they refuse to accept passwords for screen locks. I tried to explain that resetting the phone would make the problem go away for a few hours and they probably would send the phone back to me with no problem found again. Evidently, that wasn't enough to convince them to disregard the no screen lock password rule.

So I spent another couple of hours to create a repro without a screen lock. I made sure that customer care noted the steps in the ticket and I included detailed written instructions to repro the problem with the S9 when I sent it back.

After a week the phone was back. This time they had actually replaced some connector components. I pulled the phone out of the box and went through the repro I sent them. Here is a video of that:

Obviously, the problem was not only not fixed, but they didn't even try to repro the problem! When I called customer support, I complained that they didn't even verify the problem was fixed. They told me that the repair center had a policy of throwing away any extra instructions that arrive with the phone.

Third trip to Samsung repair

By now it is almost 2 months without a working phone. Again my issue was escalated, and again I was told to send the phone in. I literally begged them to do an exchange, but they said I did not qualify. I pointed out that I live in California and we have lemon laws here, but again I was told that only two trips to Samsung repair doesn't qualify for an exchange (the ubreakifix trip didn't count).

I sent it in again.

Pixel 3


This whole time I had been using my wife's old iPhone 6. It's an okay phone, but I wasn't used to the interface; it didn't work with android auto in my car; it didn't have all the apps I wanted; and it was a bit clunky. It was clear that the phone wasn't going to be fixed, so I bought a Pixel 3. I love it!

Still broken


For the third repair, they replaced the motherboard. By the time I got it back, I had been happily using my Pixel 3 for a week. Even then, there was something inside of me hoping that the phone would work. It didn't. Here is the unbox video:

All they had to do was try to set up Samsung Pay with fingerprints.

Request for refund


I called support again. This time I asked for a refund. I pointed out that I had gone over 2 months without a working phone, so I had purchased a replacement in the meantime. They said that they would put in a request to the escalation group that handles refunds and exchanges, but refunds are rare. (A representative on a later call to support said they never do refunds.) After waiting the requisite 5 business days I called back. I was transferred to the escalation group, who told me that the request had been put in as an exchange, so they could only decide if I deserved an exchange or if I could send the phone back in for repair again. They would not do a refund. I had to wait 2 business days for a decision. Amazingly, I got a response on the 2nd day!

We've successfully received a request [4149092620] for the exchange of your current model SM-G960UZKAXAA, 354267096707101.

Please return the above referenced product to our facility using the UPS return label that will be sent to you via e-mail. Once your product is received at our facility, evaluated, and it is confirmed that the unit is in a warrantable condition with no physical damage your replacement unit will be shipped. An e-mail containing the exchange information and tracking number of the shipment will be sent to you at that time.
This is the answer I was looking for about a month ago! However, I now have a new phone. I don't need an exchange; I need a refund.

Help from twitter


I did appeal to Samsung support via twitter. I suspect the motherboard replacement on the 3rd trip in was due to their insistence. They did also offer to refund me $552.49 if I sent the phone back to them. I had received $200 off because I sent an iPhone in when I bought the S9. I pointed out that I felt that I needed $752.49 since I had to pay full price for the Pixel 3. I no longer had a phone I could do an exchange with. I also said I would accept $552.49 and the phone I sent in as a refund. They would not agree to either deal.

The calls

Here is a partial record of calls I made. I only included the longer calls. There were many other short calls. I also don't have the records of the initial calls since they were not made on my phone. (They want you to call from another phone so that they can ask you to reset/reboot your S9.) Each line gives the date, the time, and the length of the call in minutes.

10/24/2018 9:15 AM 4
10/25/2018 8:10 AM 17
11/03/2018 3:54 PM 63
11/04/2018 1:27 PM 37
11/16/18 9:14 AM 8
11/19/18 7:51 AM 9
11/26/18 5:25 PM 13
11/27/18 7:46 PM 6
11/27/18 9:35 AM 34
11/28/2018 08:06 AM 149
11/29/2018 02:28 PM 6
12/04/2018 06:35 PM 23
12/05/2018 05:13 PM 8
12/12/2018 7:20 AM 13
12/19/2018 1:27 PM 38
01/03/2019 09:48 AM 20

This is about 7.5 hours on the phone just to tell them over and over "My phone powers itself off. It wasn't fixed. Please send me a new one."

Conclusion


There is something very wrong with Samsung customer service!

  • They consider the need to send a phone to repair 3 times to not be an exceptional situation worthy of an exchange.
  • They don't repro problems in repair! I'm pretty sure the problem lies with the fingerprint reader, but Samsung repair seemed content to flail and let me do the testing.
  • I bought the phone directly from Samsung, yet every time I sent the phone in, and many times that I interacted with support, I had to send a PDF of the receipt!?!
  • They spoke of the possibility of a refund many times, but it appears from my experience and comments from support, that they don't do refunds.
My Apple fan friends have made fun of me like crazy! They keep pointing out that if it was Apple, I could have walked into an Apple store and walked out with a working iPhone. It's true.

Samsung has a much better phone, but with customer service like what I experienced, it's clear why Apple is the better brand.

Samsung, please be the amazing company you could be! Fix your customer service.

If anyone else has any ideas for getting a refund, I'm all ears!


** UPDATE 1 **


This morning, 1/8/2019, I received an email saying I would get a refund! I called to get the amount that they would be refunding me and was told $552.49. I explained that I sent an iPhone in as an exchange, and I would either like that iPhone back or be refunded $752.49. The exchange person, Edgar, agreed to $752.49! (Yay!) I requested an email confirming the amount, and he said he would send it 5 minutes after hanging up. I never received the email. (That call lasted 20 mins.)

Two hours later I called to find out what happened. The support person repeatedly said that they don't agree to the refund amount until they get the phone. I pointed out that Edgar had given me an amount and I was just trying to get the email confirmation he promised. She called exchange since I was "upset". (I'm not sure why requesting support to comply with their promises is being upset...) And after a long wait she said that the email was already sent. When I asked when, she said that it would arrive in 5 mins. I requested she wait to make sure. The email did arrive:

Hi , I did see were you were offered a refund ! Yes it looks like you once you send in your device your refund will be issued !

When I pointed out that the refund amount was not in the email, she reiterated that they will not know the amount of the refund until I sent the device in.

She said someone would call me this evening. I pointed out that I really just need a refund amount in writing, but she said that that would only happen after the call and that I should just wait by the phone.

I have a feeling support doesn't realize that our lives don't revolve around them. I don't understand why I have to do everything by phone anyway. That adds another 32 minutes of phone calls for today, bringing the total to 52 mins.

** UPDATE 2 **


Hey, it's all resolved! I was able to get a confirmed refund, and I even talked with the Vice President of Customer Care about my experience. I'm super happy this is on Samsung's radar. I have hope that my next customer service experience will be better. (I recently installed a SmartThings hub and some sensors. I was trying to avoid Samsung due to this experience, but it looks like the best solution out there... So now I'm even more deeply invested in Samsung.)

Thursday, November 1, 2012

connecting to osx using l2tp/ipsec from linux

recently i needed to connect to a network using L2TP/IPSec from my linux laptop. to be honest, i think l2tp/ipsec is a stack of crap. there are actually 3 levels of protocols involved: ppp/l2tp/ipsec. it's truly horrible and seems a bit daunting to setup. tragically when i tried to use the automatic setup tools, like network manager, they all failed miserably with some cryptic error.

so i started with the basics and debugged up the stack. fortunately, in the end i did some pair debugging (thanx saleem!)

i found that when i went step by step it slowly came together.

step 1: setup ipsec


i'm using racoon for the ipsec key exchange server, so first i need to configure that. we start with /etc/racoon/racoon.conf:

path pre_shared_key "/etc/racoon/psk.txt";
path certificate "/etc/racoon/certs";

padding {
        maximum_length 20;
        randomize off;
        strict_check off;
        exclusive_tail off;
}
remote VPNIP {
        exchange_mode main,base;
        nat_traversal on;
        dpd_delay 10;
        proposal {
                encryption_algorithm 3des;
                hash_algorithm sha1;
                authentication_method pre_shared_key;
                dh_group 2;
        }
}

sainfo anonymous address VPNIP udp {
        encryption_algorithm aes128;
        authentication_algorithm hmac_sha1;
        compression_algorithm deflate;
        lifetime time 60 minutes;
}
VPNIP is the ip address of your vpn server. this file simply sets up the key exchange parameters. one key parameter (that killed a couple of hours of debugging) is the nat_traversal which signals ipsec to do the encapsulation properly.

now we need to setup the preshared secret in /etc/racoon/psk.txt:
VPNIP SHAREDSECRET
now that racoon is configured, we just need to setup the policy for ipsec to kick in. we will write a simple shell script vpn-ipsec.sh:
#!/bin/bash

setkey -c <<EOF

flush;
spdflush;

spdadd VPNIP[1701] 0.0.0.0/0 udp -P in ipsec esp/transport//require;
spdadd 0.0.0.0/0 VPNIP[1701] udp -P out ipsec esp/transport//require;

EOF
this policy says to use ipsec if we are sending to port 1701 (l2tp) on VPNIP.

once we run vpn-ipsec.sh ipsec will be setup and we will be ready to startup l2tp.

step 2: setup l2tp

l2tp is made up of two parts: the tunneling protocol and ppp running over that protocol. the tunneling protocol really doesn't do much. it's not even authenticated. here is the /etc/xl2tpd/xl2tpd.conf:
[global]
ipsec saref = yes

[lac vpn]
lns = VPNIP
pppoptfile = /etc/ppp/options.l2tpd.client
redial timeout = 5
again there isn't really much here. the [lac vpn] line indicates that we are going to be initiating the connection to the l2tp server. vpn is the name of the connection. we will use this to actually make the connection.

the real work of actually handling the network traffic is done by ppp. we configure that in /etc/ppp/options.l2tpd.client:
ipcp-accept-local
ipcp-accept-remote
usepeerdns
lock
name USERID
password PASSWORD
make sure to replace USERID and PASSWORD with your user id and password. the first two lines are going to configure our local ip address for the tunnel. unfortunately it doesn't setup the route correctly. (it seems like it should... more investigation needed.) for now we will do it manually in the final step.

step 3: pull it all together

now that we have all the configuration files, let's enhance the vpn-ipsec.sh script to start everything up:
#!/bin/bash

setkey -c <<EOF

flush;
spdflush;

spdadd VPNIP[1701] 0.0.0.0/0 udp -P in ipsec esp/transport//require;
spdadd 0.0.0.0/0 VPNIP[1701] udp -P out ipsec esp/transport//require;

EOF

xl2tpd-control connect vpn
while [ ! -d /proc/sys/net/ipv4/conf/ppp0 ]
do
        echo waiting on l2tp
        sleep 1
done

sleep 1

route add -net NETWORK ppp0

enjoy!

btw, if you need to debug you can turn on ppp and racoon debugging in the config files. turning on debugging on the server side also helps greatly! wireshark and tcpdump are your friends!

this really is just a hack. i still want to figure out how to get the network route to self configure. i also would like to figure out how to keep the credentials in a more secure store.

Thursday, September 13, 2012

relinking an elf executable

in the olden days on AIX you could easily relink an executable by simply relinking: gcc -o newbinary oldbinary newcode.o. this made it very easy to patch a binary without doing a full build.

for example, let's say we have the following program:

myprog.c:
#include <stdio.h>

void foo()
{
    printf("foo\n");
}

int main()
{
    foo();
}
so you compile: gcc -o myprog myprog.c. when you run it, you see:
breed@maluca:~$ ./myprog
foo
awesome right? well after a few days, you get bored of seeing "foo", and you would really like to see "goo" instead. but alas, you have lost the source for myprog.c. what can you do? will you never see myprog output goo?

never fear, elfsh is here! it's a very cool, albeit abandoned, project. let's patch myprog:

first we want to write a new function:

goo.c:

#include <stdio.h>

void goo()
{
    printf("goo\n");
}

then we compile it: gcc -c goo.c

if only we could relink myprog... well we can get close with elfsh. check out the following:

(elfsh-0.82-b2-dev@local) load ./myprog
 [*] Thu Sep 13 00:52:44 2012 - New object loaded : ./myprog
(elfsh-0.82-b2-dev@local) load ./goo.o
 [*] Thu Sep 13 00:52:48 2012 - New object loaded : ./goo.o
(elfsh-0.82-b2-dev@local) reladd 1 2
 [E] Failed to inject ET_REL with workspace
(elfsh-0.82-b2-dev@local) reladd 1 2
 [*] ET_REL ./goo.o injected succesfully in ET_EXEC ./myprog
(elfsh-0.82-b2-dev@local) redir foo goo
Found sect .text at off 50132
 [*] Function foo redirected to addr 0x0804313F <goo>
(elfsh-0.82-b2-dev@local) save mynewprog
 [*] Object mynewprog saved successfully
(elfsh-0.82-b2-dev@local) quit

first we load the executable and the object file we want to link in. then we add them together. (for some reason you have to do it twice...) finally, you remap the symbol foo to goo.
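the ET_REL and ET_EXEC in the elfsh output are the ELF file types of goo.o (a relocatable object) and myprog (an executable). if you want to check what you are about to feed to elfsh, the type and the 32/64-bit class sit in the first few bytes of the file. here is a small python sketch that reads them; describe_elf is a name i made up, and it only decodes the handful of values listed in the two tables.

```python
import struct

ELF_CLASSES = {1: "32-bit", 2: "64-bit"}           # e_ident[EI_CLASS]
ELF_TYPES = {1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN"}  # e_type


def describe_elf(header):
    """Return (class, type) from the first 18 bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    elf_class = ELF_CLASSES.get(header[4], "unknown")
    # e_ident[EI_DATA]: 1 means little endian, otherwise treat as big endian
    byte_order = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field right after the 16 byte e_ident array
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return elf_class, ELF_TYPES.get(e_type, "other")
```

to use it on a real file, read the header first: `describe_elf(open("./goo.o", "rb").read(18))`.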

now we run mynewprog:

breed@maluca:~$ ./mynewprog
goo

yay! now we can bask in the glory of goo!

tragically it appears that the elfsh project may be abandoned, and 64-bit support seems to be lacking. perhaps someone will get excited about it and start enhancing it again...

Wednesday, October 12, 2011

wifi enabling your xbox 360

my daughter received an xbox 360 from her friend for her birthday. i wanted to try out kinect, so i bought a used one from the local eb games and rented just dance 3 from gamefly. unfortunately it is a rather old 360, so i had to buy the ac adapter for it. (lame! the 360 is a real power hog and now we have to plug something else in?!?) i hook everything up, power on the xbox, and up comes "you need to do a system upgrade".

so, i go to system settings to setup the wifi to do the upgrade and that is when i discover -- no built-in wifi! really!?! the wifi adapter is another $40, but all i really need it for is the upgrade. so i pull out my trusty thinkpad and try a bridge. (wow, i should be getting paid for these product placements!)

it turns out to be really easy if you have a linux laptop with wifi and an ethernet port.

it helps to know a bit of background information:

private network addresses are networks not found in the internet. they are designed to be used for local private networks. you see them used in your home router. if you look at the address of your laptop right now, it will probably be 192.168.1.X. you can find a full list, and description, at http://en.wikipedia.org/wiki/Private_network.

we are going to use nat aka masquerading to route traffic from the xbox through the laptop to the internet. when doing nat the laptop will rewrite network traffic flowing through it so that it looks like the traffic is coming from the laptop, which is connected to the internet. (the funny thing is, your laptop is probably connected to a wireless router which is doing exactly the same thing.)

when connecting network devices, such as a laptop and an xbox using twisted pair ethernet (in other words, using those glorified phone jacks) you usually use two cables to connect each device to a network hub. the hub is too much of a pain if we are only connecting two devices. an alternative is to use a crossover cable that allows two devices to directly connect to each other. in a sense it makes each device look like a hub to the other device. fortunately, any reasonably new laptop will have an ethernet port that supports auto-mdix, which will automatically detect if a cross-over cable is needed and adjust accordingly. this allows you to get the cross-over cable functionality using a normal ethernet cable.

finally, network devices use DNS to translate names into addresses, so we need the address of a DNS server to do this translation. fortunately, google provides an open DNS server for anyone to use. (right now you are probably using a DNS server provided to you by your internet service provider.) the address of the google DNS server is 8.8.8.8.

physically connecting xbox to the laptop

  1. connect the xbox and laptop together using a single ethernet cable.
okay that was simple...

setting up the networking on linux

we are going to use a private network to connect the two devices. our linux box which will be the router will have the address 172.16.17.1. our xbox will have the address 172.16.17.2. we also need to turn on routing on our linux box. finally, we need to tell the kernel to masquerade the packets coming from the xbox. we do this all with the following three simple commands. (make sure you are root when you do this!)
# ifconfig eth0 172.16.17.1
# echo 1 > /proc/sys/net/ipv4/ip_forward
# iptables -t nat -A POSTROUTING -o wlan0 -j MASQUERADE
that wasn't too hard either right?

setting up the networking on the xbox

go to the system settings menu and choose to configure the wired settings. by default it will do automatic configuration. we could set linux up to configure the xbox automatically, but that is more work. it's easy to set up manually:

  1. change the ip address to 172.16.17.2
  2. change the subnet to 255.255.0.0
  3. change the gateway to 172.16.17.1 (the laptop's ip address)
  4. change the DNS server to 8.8.8.8 (the google dns server)
you should end up with a screen that looks like:


and there you have it. you are good to go! the update came down and i was able to play just dance 3 on my updated xbox 360 with my kids.

Thursday, September 1, 2011

java profiling is confused by i/o


the other day a colleague came to me with output from a java profiler that showed that we were spending way too much cpu time doing i/o. (in our case the i/o was both network and disk i/o.) the application in question should be i/o bound but wasn't. we could see that by running top, so he was using the profiler to figure out where the cpu time was being spent.

modern operating systems are great about putting tasks waiting on i/o to sleep so that the cpu can work on other tasks or just idle until the i/o is finished. (in the dark ages your computer would hang during i/o. it was pretty amazing when microsoft released an operating system that would allow you to keep working while you formatted your disks!)

profilers work by taking periodic snapshots (or samples) of what a process is doing. after taking many such samples they can figure out where most of the time is spent in the code. an alternate way of profiling, much more accurate but with more overhead, is to put probes at each call entrance and exit and calculate the running time of each invocation of the call.
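the sampling approach can be sketched in a few lines of java. this is a minimal illustration, not how any real profiler is implemented; TinySampler and its method names are made up, and it only watches a single thread.

```java
import java.util.HashMap;
import java.util.Map;

class TinySampler {
    // take `samples` snapshots of the target thread, `intervalMs` apart, and
    // count how often each method is found at the top of its stack
    static Map<String, Integer> sample(Thread target, int samples, long intervalMs) {
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) {
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(top, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // a thread that just burns cpu so the sampler has something to look at
        Thread busy = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                Math.sqrt(System.nanoTime());
            }
        });
        busy.setDaemon(true);
        busy.start();
        System.out.println(sample(busy, 50, 2));
        busy.interrupt();
    }
}
```

notice that a snapshot only says where a thread is, not whether it is actually using the cpu there: a thread that is blocked still has a method at the top of its stack.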

my colleague knew all of this of course, which is why he was concerned that our i/o functions had too much overhead. i pointed out that past experience had taught me that the java profiler counts the time that a thread waits on i/o as running time. thus, you cannot trust the profiler's output when you are doing a bunch of i/o.

ever the skeptic, my colleague pointed out that that would be stupid. the profiler can see that a thread is waiting for i/o and thus should not count the waiting time; therefore i had to be wrong.

to prove my point i wrote the following short program:

import java.net.*;
import java.util.*;
class t {
  public static void main(String a[]) throws Exception {
    // cpu-bound thread: builds a string and throws it away, over and over
    new Thread() {
      public void run() {
        while(true) {
          try {
            String s = "";
            for(int i=0; i<1000; i++) { s=s+i; }
            Thread.sleep(1);
          } catch(Exception e) {}
        }
      }
    }.start();
    // i/o-bound main thread: blocks waiting for input on stdin
    System.in.read();
  }
}

all this program does is start up a thread that burns cpu by building a string and then throwing it away. the main thread, in the meantime, is just sitting there waiting for input from stdin. obviously, one thread is cpu bound while the other is i/o bound. so let's see what the java profiler tells us after we let our test program run for a while under the profiler and then kill it:

breed@laflaca:~$ java -agentlib:hprof=cpu=samples t
^CDumping CPU usage by sampling running threads ... done.
breed@laflaca:~$ cat java.hprof.txt
JAVA PROFILE 1.0.1, created Wed Aug 31 11:45:20 2011
Header for -agentlib:hprof (or -Xrunhprof) ASCII Output (JDK 5.0 JVMTI based)
...
CPU SAMPLES BEGIN (total = 966) Wed Aug 31 11:48:21 2011
rank   self  accum   count trace method
   1 54.76% 54.76%     529 300025 java.io.FileInputStream.readBytes
   2 17.60% 72.36%     170 300040 java.lang.Integer.getChars
   3 11.08% 83.44%     107 300039 t$1.run

according to this, more than half of the time is spent in the read call! of course virtually no CPU time is spent there since it is waiting on i/o. ok, maybe it is a problem with sampling. let's try it with probes:

breed@laflaca:~$ java -agentlib:hprof=cpu=times t
^CDumping CPU usage by timing methods ... done.
breed@laflaca:~$ cat java.hprof.txt
...
CPU TIME (ms) BEGIN (total = 4946) Wed Aug 31 11:50:46 2011
rank   self  accum   count trace method
   1 12.94% 12.94%   23859 301024 java.lang.AbstractStringBuilder.append
   2 12.68% 25.62%   23858 301028 java.lang.AbstractStringBuilder.append
   3  6.61% 32.23%   23858 301031 java.lang.String.<init>

much better! the top three calls are all related to string processing, which is what we would expect. actually, string processing makes up the top 20 calls. awesome, so perhaps we just need to use times... but wait. the read call never returned, so its time was never recorded. what happens if we give our program some input (by pressing enter) before we kill it?

breed@laflaca:~$ java -agentlib:hprof=cpu=times t
^CDumping CPU usage by timing methods ... done.
breed@laflaca:~$ cat java.hprof.txt
...
CPU TIME (ms) BEGIN (total = 8431) Wed Aug 31 11:54:38 2011
rank   self  accum   count trace method
   1 49.91% 49.91%       1 301067 java.io.FileInputStream.read
   2  6.59% 56.51%   19795 301028 java.lang.AbstractStringBuilder.append
   3  6.35% 62.85%   19796 301024 java.lang.AbstractStringBuilder.append

yep, there's the read again!

so, what's the moral of this story? be careful when using java profilers. they can be useful tools, but when dealing with processes that do lots of i/o you may need to skip over some of the results to get to the real performance problems.
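one way to double check a suspicious result is to ask the jvm how much cpu time a thread has really used and compare it with the wall-clock time that passed. the sketch below does that with ThreadMXBean.getThreadCpuTime from java.lang.management. CpuVsWall and measure are made-up names, and the code assumes the jvm supports per-thread cpu time measurement (the call throws UnsupportedOperationException if it does not).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class CpuVsWall {
    // watch thread t for watchMs milliseconds and return
    // { cpu nanoseconds it used, wall-clock nanoseconds that passed }
    static long[] measure(Thread t, long watchMs) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long cpu0 = mx.getThreadCpuTime(t.getId());
        long wall0 = System.nanoTime();
        try {
            Thread.sleep(watchMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        long cpu1 = mx.getThreadCpuTime(t.getId());
        return new long[] { cpu1 - cpu0, System.nanoTime() - wall0 };
    }

    public static void main(String[] args) {
        // the same two kinds of thread as the test program: one waits on stdin, one burns cpu
        Thread waiter = new Thread(() -> {
            try { System.in.read(); } catch (Exception e) {}
        });
        Thread burner = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                String s = "";
                for (int i = 0; i < 1000; i++) { s = s + i; }
            }
        });
        waiter.setDaemon(true);
        burner.setDaemon(true);
        waiter.start();
        burner.start();
        long[] w = measure(waiter, 500);
        long[] b = measure(burner, 500);
        System.out.println("waiter: cpu ms = " + w[0] / 1_000_000 + ", wall ms = " + w[1] / 1_000_000);
        System.out.println("burner: cpu ms = " + b[0] / 1_000_000 + ", wall ms = " + b[1] / 1_000_000);
    }
}
```

if a method shows up at the top of a profile but the thread running it has used almost no cpu time by this measure, the profiler is most likely counting time spent waiting.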

Wednesday, July 6, 2011

a network layer proxy

i've been wanting to get this going for a long time. i'm often on untrusted networks, and it would be nice to use a secure tunnel to get onto the internet. luckily there is this cool program called redsocks that will interface with iptables to transparently use a socks proxy. this page was an excellent guide to setting things up.

first, we set up a chain to hold our routing policy. we will put all the policy in a chain called REDSOCKS:
sudo iptables -t nat -N REDSOCKS
sudo iptables -t nat -A REDSOCKS -d 10.0.0.0/8 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 127.0.0.0/8 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 169.254.0.0/16 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 172.16.0.0/12 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 192.168.0.0/16 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 224.0.0.0/4 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 240.0.0.0/4 -j RETURN
sudo iptables -t nat -A REDSOCKS -p tcp -j REDIRECT --to-ports 12345

the bulk of the rules (the ones with RETURN) are there so that we do not change the routing of connections to private and other special addresses. the last rule is key; it will cause all other tcp connections to get routed through redsocks. specifically, it will cause the kernel to redirect those connections to port 12345, which redsocks will be listening on. redsocks then uses SOCKS to route the connection through ssh and out to the internet. (for the curious, and i was: redsocks figures out the original target of the socket using a sockopt, SO_ORIGINAL_DST.)
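the decision the chain makes for each new tcp connection can be modeled in a few lines. this is just a sketch to show the first-match logic; RedsocksChain and verdict are made-up names, and the real matching is of course done by the kernel, not by java code.

```java
class RedsocksChain {
    // the destinations that the RETURN rules leave alone, in rule order
    static final String[] BYPASS = {
        "10.0.0.0/8", "127.0.0.0/8", "169.254.0.0/16", "172.16.0.0/12",
        "192.168.0.0/16", "224.0.0.0/4", "240.0.0.0/4"
    };

    // turn a dotted quad like "172.16.17.2" into a 32 bit integer
    static int toInt(String dotted) {
        int v = 0;
        for (String part : dotted.split("\\.")) {
            v = (v << 8) | Integer.parseInt(part);
        }
        return v;
    }

    // true if addr falls inside the network written as "a.b.c.d/bits"
    static boolean inCidr(String addr, String cidr) {
        String[] parts = cidr.split("/");
        int bits = Integer.parseInt(parts[1]);
        int mask = bits == 0 ? 0 : -1 << (32 - bits);
        return (toInt(addr) & mask) == (toInt(parts[0]) & mask);
    }

    // first matching rule wins; anything left over hits the final REDIRECT rule
    static String verdict(String dst) {
        for (String cidr : BYPASS) {
            if (inCidr(dst, cidr)) {
                return "RETURN";
            }
        }
        return "REDIRECT";
    }

    public static void main(String[] args) {
        for (String dst : new String[] { "172.16.17.2", "127.0.0.1", "8.8.8.8" }) {
            System.out.println(dst + " -> " + verdict(dst));
        }
    }
}
```

so a connection to 172.16.17.2 or 127.0.0.1 is left alone, while a connection to 8.8.8.8 falls through to the last rule and is redirected to redsocks.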

note that in this case the server we are sshing to is 127.0.0.1, and the ssh connection itself must not be routed through redsocks. (otherwise we have a chicken and egg situation.) the 127.0.0.0/8 rule already takes care of that here. if we are sshing to a routable address instead, we need to add a RETURN rule for it. if X.X.X.X is the machine that we are sshing to, the table setup would be as follows:

sudo iptables -t nat -N REDSOCKS
sudo iptables -t nat -A REDSOCKS -d X.X.X.X/32 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 10.0.0.0/8 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 127.0.0.0/8 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 169.254.0.0/16 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 172.16.0.0/12 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 192.168.0.0/16 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 224.0.0.0/4 -j RETURN
sudo iptables -t nat -A REDSOCKS -d 240.0.0.0/4 -j RETURN
sudo iptables -t nat -A REDSOCKS -p tcp -j REDIRECT --to-ports 12345
one other thing to note is that the link above says to use the DNAT target rather than REDIRECT. for this part of the project both worked, but i couldn't get DNAT working for connections routed through my laptop from the router. we will get to how to set up this routing in the next post.

ok, now the REDSOCKS chain is set up, but nothing will happen since it is just a chain floating out there in iptables. we've got to connect it to a chain that is actively involved in packet processing. the chain we are looking for is the OUTPUT chain. this chain is used in the initial stages of local packet routing.
sudo iptables -t nat -A OUTPUT -p tcp -j REDSOCKS
the iptables has an excellent explanation of the iptable chains and how/why they are used.

with that last rule the kernel will start routing connections to port 12345. we need to start something listening on that port. that something is redsocks. you'll need to download it and compile it.

before we start up redsocks, we need to point it at a configuration file.

base {
        log_info = on;
        log = stderr;
        daemon = off;
        redirector = iptables;
}
redsocks {
        local_ip = 127.0.0.1;
        local_port = 12345;
        // `ip' and `port' are IP and tcp-port of proxy-server
        ip = 127.0.0.1;
        port = 1080;
        type = socks5;
}

nothing really tricky here. the local_port is the port that the kernel will use to forward connections to redsocks. ip and port are the ip address and port of the socks server. once you have created the config file (i called it redsocks.conf), start redsocks with:
redsocks -c ./redsocks.conf
of course make sure your socks server is up. from the previous post i was forwarding an ssh port to local port 2222, so i started up socks with:
ssh -p 2222 -D 1080 127.0.0.1
the -D 1080 flag tells ssh to service SOCKS clients on port 1080. so there you go. transparent tunneling over the internet. pretty cool huh? of course we have to do a bit more if we want to route connections through the machine.
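if things don't work, it can help to check the socks server on its own before blaming iptables or redsocks. here is a small sketch that opens a tcp connection through the socks server explicitly, using java's built-in SOCKS support. SocksCheck, localSocks and canReach are made-up names, and example.com port 80 is just a placeholder destination.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.Socket;

class SocksCheck {
    // proxy object pointing at a SOCKS server on the local machine,
    // such as the one `ssh -D 1080` provides
    static Proxy localSocks(int port) {
        return new Proxy(Proxy.Type.SOCKS, new InetSocketAddress("127.0.0.1", port));
    }

    // try to open a tcp connection to host:port through the proxy;
    // the host name is left unresolved so the proxy does the lookup
    static boolean canReach(Proxy proxy, String host, int port, int timeoutMs) {
        try (Socket s = new Socket(proxy)) {
            s.connect(InetSocketAddress.createUnresolved(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("socks server usable: "
                + canReach(localSocks(1080), "example.com", 80, 5000));
    }
}
```

if this prints false, the problem is with the ssh tunnel or the socks server rather than with the iptables rules.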