奇妙的巨蟒Writeup

Posted by Pwnhub on 2017-08-29

上周五本胖在有各种比赛各种会议的情况下,依然坚挺地开了一期 Reverse 比赛,可谓凑了一份热闹~本来几乎不抱希望能够有人做出来(毕竟出题人对我说他也做不出来…),但是没想到最后关头还是有真爱粉提交了 Flag 和 Wp,简直让人热泪盈眶!

下面我们来看看这位真爱粉 @Hcamael 的 Writeup 吧——为真爱鼓掌:


本题是一道 pyc 逆向题,用 uncompyle6 没法跑出来,所以猜测又是考 python 的 opcode

之前做过相关研究,也写过一篇blog:http://0x48.pw/2017/03/20/0x2f/

主要是两个参考文档:

  1. opcode.h
  2. opcode 具体代表的操作
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
>>> import dis, marshal
>>> f = open("./final.pyc")
>>> f.read(8)
'\x03\xf3\r\nT\x16xY'
>>> code = marshal.load(f)
>>> code.co_consts
.....(输出太多了省略)
>>> code.co_varnames
('DIVIDER',)
>>> code.co_names
('a', 'b', 'int', 'str', 'i', 'q', 'raw_input', 'True', 'aa', 'xrange', 'OO00000O0O0OO0OOO', 'sys', 'OO000OOO0O000O00O', 'time', 'OO0OO00O0OO0OO00O', 'False', 'r', 'marshal', 'c', 'x', 'p', 'None', 'f', 'args', 'kwargs', 'u')

我们主要关心上面这三个集合,分别是co_consts, co_varnames, co_names

其中:

  • co_consts 集合中包含的是该模块中定义的各种数据类型,具体的值,比如定义的对象,赋值的字符串/int型数据
  • co_varnames 表示的是当前作用域的局部变量的变量名

接下来就是看具体的 opcode :

1
2
3
4
>>> dis.disassemble_string(code.co_code)
Traceback (most recent call last):
IndexError: string index out of range
string index out of range

但是发现报错,跑不了,我的思路是一句一句看,从 opcode.h 头文件中可以看出 opcode 为 1 字节,再加上操作数的话就是 3 字节,所以一句指令的长度是 1 字节或者 3 字节,所以:

1
2
3
4
5
>>> dis.disassemble_string(code.co_code[:3])
0 JUMP_ABSOLUTE 3292
>>> dis.disassemble_string(code.co_code[3292:3292+3])
0 JUMP_FORWARD 24 (to 27)

我使用上面的方面进行简单的测试,发现有一大堆的JUMP_ABSOLUTEJUMP_FORWARD指令,这时就知道这里有混淆了。

参考文档,我们可以知道JUMP_ABSOLUTE是跳到绝对地址,JUMP_FORWARD是下一个地址加上操作数,比如0 JUMP_FORWARD 24 (to 27) ,当前地址 0 ,当前指令是 3 字节,下一个地址是 3 ,加上 24 是 27 ,所以执行完这句指令是跳到 27 ,这里我只是举例,在本题中,地址 0 是从 code.co_code[0] 开始

该模块的最外层很麻烦,追了一会指令流就看不住了,然后就看定义的函数:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
>>> func_list = []
>>> for x in code.co_consts:
... if type(x) == type(code.co_consts[0]):
... func_list.append(x)
>>> for x in func_list:
... print x.co_name # 函数名
a
q
a
a
a
aa
a
a
aa
r
a
a
a
p
a
a
a
f
a
<setcomp>
<dictcomp>
u
>>> for x in func_list:
... print x.co_consts
(None, 2574606289, None)
(None, 'hex', '', 2269302367, 3999397071, 3212575724, 4011125418, 2541851390, 3101964664, 4002314880, None)
(3363589608, None)
(None, 928441828, None)
(None, 1, 2827689411, 3340835492, None)
(0, 3149946851, 1915448404, None)
(None, 1, 1761489969, None)
(None, 3346499627, None)
(0, 804230483, 1849535108, None)
(None, 18, 1, 0, 811440571, 694805067, 1480591167, 2317567929, None)
(None, '', 103332102, 3569318510, 2445961406, 2136442608, 3449813582, None)
(None, 1254503156, None)
(None, 1, 3745711837, None)
(None, 13, 25, 254, 256, 184, 139, 1, 2, 3, 158, 161, 21, 10, 251, 142, 128, 115, 5, 99, 28, 130, 253, 17, 219, 88, 180, 14, 83, 119, 101, 7, 57, 178, 91, 245, 207, 0, 249, 166, 230, 85, 8, 213, 134, 240, 4, 199, 255, 202, 6, 30, 9, 173, 69, 227, 124, 15, 141, 205, 170, 11, 133, 218, 149, 12, 193, 67, 24, 16, 103, 151, 145, 4002470191, 2521589842, 1264028523, 1557840806, 2269633706, 951771769, 1948225321, 2840041954, 240350730, 2835968845, 1344465054, 1832969381, 414996033, 893304341, 1033856613, 2005820485, 1655033734, 383297387, 1110377909, 1331741225, 98787899, 3245587348, 3507579705, 2710942562, 408230478, 4193925412, 4258146773, 3555027567, 2696796853, 3228309104, 1702138493, 878416672, 1840033377, 2212037170, 1264539365, 155548767, 3125510233, 2468296542, 2105197060, 1611521139, 2978471848, 3090963965, 3551862263, 4190549182, 1060650455, 418207362, 2505390665, 148314961, 1392669086, 3687927788, 740579929, 2902468892, 3221147519, 1094609218, 2451398154, 2409455404, 3351906386, 2473439137, 3475738179, 1904786329, 3519084889, 979327822, 2909197751, 2846946149, 3980818176, 4127800602, 1291996042, 4037586272, 2675091267, 199113052, 710970151, 1897807508, 1373489195, 1776856572, 1804854838, 1781505473, 3306320587, 1760320652, 860749406, 161432034, 3258951656, 2792565458, 1916846289, 2023044049, 1935716574, 1285095588, 3035625565, 3586006421, 2368742222, 3131839710, 2298893290, 1460710676, 4009727955, 2535652387, 19895811, 2953554646, 1834358963, None)
(None, 1, 156819970, None)
(None, 1, 2362387540, None)
(None, 807794131, None)
(None, 1, <code object O0O00O0OOOOO000OO at 0x7fd9b10f6ab0, file "ck.py", line 200>, 2901513116, 1218601877, 625447945, None)
(None, 2014553041, None)
(1296050898, 2236454079, 1998426264, 3102970915, None)
(2343257866, 676615509, 2173771105, 697135550, 1974986440, None)
(None, '', 'Wrong key!', 'Good job! The flag is pwnhub{flag:your input(lower case)}', 3463300106, 3857901018, 3949890875, 174919631, 1639824250, 433978434, 3710075802, 161154336, 33478671, 2489981027, 1574135945, 3935706030, 1700692433, 832561131, None)

随便看了下各个函数的相关信息,发现 u 函数中有 flag 相关信息,然后开始逆 u 函数,首先收集下 u 函数的相关变量信息:

1
2
3
4
5
6
7
8
9
10
11
12
13
>>> u = func_list[-1]
>>> u.co_argcount
1
>>> u.co_varnames
('OOO000OOOOOO00OOO', 'OOOO000OO000OOOOO', 'DIVIDER')
>>> u.co_names
('q', 'r', 'p')
>>> u.co_consts
(None, '', 'Wrong key!', 'Good job! The flag is pwnhub{flag:your input(lower case)}', 3463300106, 3857901018, 3949890875, 174919631, 1639824250, 433978434, 3710075802, 161154336, 33478671, 2489981027, 1574135945, 3935706030, 1700692433, 832561131, None)

写了个脚本,自动追踪指令并输出,但是跳过两个JUMP指令的输出,然后又发现了一个控制流平坦化混淆…….简单的举例下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
175:
0 LOAD_CONST 10 (10)
178:
0 STORE_FAST 2 (2)
247:
0 LOAD_CONST 9 (9)
44:
0 LOAD_FAST 2 (2)
47:
0 COMPARE_OP 2 (==)
50:
0 POP_JUMP_IF_TRUE 74
53:
0 LOAD_CONST 15 (15)
56:
0 LOAD_FAST 2 (2)
59:
0 COMPARE_OP 2 (==)
62:
0 POP_JUMP_IF_TRUE 77
65:
0 LOAD_CONST 5 (5)
68:
0 LOAD_FAST 2 (2)
596:
0 COMPARE_OP 2 (==)
599:
0 POP_JUMP_IF_TRUE 626
602:
0 LOAD_CONST 11 (11)
605:
0 LOAD_FAST 2 (2)
608:
0 COMPARE_OP 2 (==)
611:
0 POP_JUMP_IF_TRUE 629
614:
0 LOAD_CONST 10 (10)
617:
0 LOAD_FAST 2 (2)
620:
0 COMPARE_OP 2 (==)
88:
0 POP_JUMP_IF_TRUE 115

解释下各个指令的含义:

  • LOAD_CONST 10 (10) ==> push co_consts[10]
  • STORE_FAST 2 (2) ==> pop co_varnames[2]
  • LOAD_FAST 2 (2) ==> push co_varnames[2]
  • COMPARE_OP 2 (==) ==> pop x1; pop x2; if x1 == x2: push 1 else: push 0 (该指令的操作数2表示栈上的两个数进行比较)
  • POP_JUMP_IF_TRUE 74 ==> pop x1; if x1: jmp 74

翻译成伪代码就是:

1
2
3
4
5
6
7
8
9
10
11
DIVIDER = co_consts[10]
if DIVIDER == co_consts[9]:
jmp 74
if DIVIDER == co_consts[15]:
jmp 77
if DIVIDER == co_consts[5]:
jmp 626
if DIVIDER == co_consts[11]:
jmp 629
if DIVIDER == co_consts[10]:
jmp 115

这个就是控制流平坦化混淆,中间有一堆垃圾代码,因为我怕时间来不及就没有写全自动换脚本,是半自动半手工做题,用脚本去掉 JUMP 混淆,把结果输出到文件中,然后用 ctrl+f ,去掉控制流平坦化混淆(之后会在我博客中放全自动脚本)

去掉混淆后的代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
283:
0 LOAD_GLOBAL 0 (0) TOP1
286:
0 LOAD_FAST 0 (0) TOP
289:
0 CALL_FUNCTION 1 CALL TOP1(TOP)
q(OOO000OOOOOO00OOO)
229:
0 STORE_FAST 1 (1)
OOOO000OO000OOOOO = q(OOO000OOOOOO00OOO)
232:
0 LOAD_FAST 1 (1)
235:
0 LOAD_CONST 1 (1)
336:
0 COMPARE_OP 2 (==)
222:
0 POP_JUMP_IF_FALSE 253
if OOOO000OO000OOOOO != "": JUMP 253
657:
0 LOAD_CONST 2 (2)
660:
0 PRINT_ITEM
661:
0 PRINT_NEWLINE
PRINT co_consts[2]
276:
0 LOAD_CONST 0 (0)
0 RETURN_VALUE
return 0
###
def u(OOO000OOOOOO00OOO):
OOOO000OO000OOOOO = q(OOO000OOOOOO00OOO)
if OOOO000OO000OOOOO == "":
print 'Wrong key!'
return 0
###

第一个分支我们可以翻译出上面的代码,然后把指令调到 253 ,在继续跑脚本:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
682:
0 LOAD_GLOBAL 1 (1)
685:
0 LOAD_FAST 1 (1)
688:
0 CALL_FUNCTION 1
691:
0 POP_JUMP_IF_FALSE 709
if r(OOOO000OO000OOOOO) == False: JMP 709
16:
0 LOAD_CONST 2 (2)
19:
0 PRINT_ITEM
20:
0 PRINT_NEWLINE
PRINT co_consts[2]
317:
0 LOAD_CONST 0 (0)
0 RETURN_VALUE
return 0

继续跟踪到 709 :

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
324:
0 LOAD_GLOBAL 2 (2)
327:
0 LOAD_FAST 1 (1)
330:
0 CALL_FUNCTION 1
241:
0 POP_JUMP_IF_FALSE 262
if p(OOOO000OO000OOOOO) == False: JMP 262
701:
0 LOAD_CONST 2 (2)
704:
0 PRINT_ITEM
705:
0 PRINT_NEWLINE
print co_consts[2]
10:
0 LOAD_CONST 0 (0)
0 RETURN_VALUE
return 0

根据到 262 :

1
2
3
4
5
6
7
8
9
10
11
12
24:
0 LOAD_CONST 3 (3)
27:
0 PRINT_ITEM
28:
0 PRINT_NEWLINE
311:
0 LOAD_CONST 0 (0)
0 RETURN_VALUE
print co_consts[3]
return 0

根据上面追踪翻译出来的代码,成功还原出 u 函数:

1
2
3
4
5
6
7
8
9
10
11
12
13
def u(OOO000OOOOOO00OOO):
OOOO000OO000OOOOO = q(OOO000OOOOOO00OOO)
if OOOO000OO000OOOOO == "":
print ERROR
return 0
if r(OOOO000OO000OOOOO):
print ERROR
return 0
if p(OOOO000OO000OOOOO):
print ERROR
return 0
print "Good job!"
return 0

如果输出Good job!则表示得到flag

所以下面就是取逆向q, r, p三个函数,原理和上面逆向u函数一样:

1
2
3
4
5
6
7
def q(O0OOO0O0O00O00OOO):
return O0OOO0O0O00O00OOO.decode('hex')
def r(O0O000OO00OO00OO0):
if len(O0O000OO00OO00OO0) == 18:
return 0
return 1

qr两个函数,一个是进行decode操作,一个是判断长度,所以判断flag是否正确就在p函数中,而p函数是手工最难逆的函数,我从下午 6 点,逆到了 8 点,/(ㄒoㄒ)/~~,我应该是采取了最笨的方法,前面提到了,我现在有个自动化的思路,之后会放到我 blog 中。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
def p(hhh):
if ((ord(hhh[13])*25+254)%256) ^ 184 == 139:
if ((ord(hhh[2])*3+158)%256) ^ 161 == 21:
if ((ord(hhh[10])*251+142)%256) ^ 128 == 115:
if ((ord(hhh[5])*99+28)%256) ^ 130 == 253:
if ((ord(hhh[17])*219+88)%256) ^ 130 == 180:
if ((ord(hhh[14])*83+119)%256) ^ 161 == 101:
if ((ord(hhh[7])*57+178)%256) ^ 184 == 91:
if ((ord(hhh[1])*245+207)%256) ^ 184 == 57:
if ((ord(hhh[0])*249+166)%256) ^ 230 == 85:
if ((ord(hhh[8])*213+134)%256) ^ 161 == 240:
if ((ord(hhh[4])*199+255)%256) ^ 128 == 202:
if ((ord(hhh[6])*85+30)%256) ^ 230 == 202:
if ((ord(hhh[9])*173+69)%256) ^ 227 == 124:
if ((ord(hhh[15])*141+205)%256) ^ 227 == 170:
if ((ord(hhh[11])*133+218)%256) ^ 130 == 149:
if ((ord(hhh[12])*139+193)%256) ^ 230 == 2:
if ((ord(hhh[3])*67+202)%256) ^ 227 == 24:
if ((ord(hhh[16])*103+151)%256) ^ 128 == 145:
return 0
return 1
# 这代码弄出来的时候差点猝死

然后写个脚本爆破出 flag (现在想想,应该可以用 z3 ):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#! /usr/bin/env python
# -*- coding: utf-8 -*-
flag = [""]*18
bds = ['((x*25+254)%256) ^ 184 == 139', '((x*3+158)%256) ^ 161 == 21', '((x*251+142)%256) ^ 128 == 115', '((x*99+28)%256) ^ 130 == 253', '((x*219+88)%256) ^ 130 == 180', '((x*83+119)%256) ^ 161 == 101', '((x*57+178)%256) ^ 184 == 91', '((x*245+207)%256) ^ 184 == 57', '((x*249+166)%256) ^ 230 == 85', '((x*213+134)%256) ^ 161 == 240', '((x*199+255)%256) ^ 128 == 202', '((x*85+30)%256) ^ 230 == 202', '((x*173+69)%256) ^ 227 == 124', '((x*141+205)%256) ^ 227 == 170', '((x*133+218)%256) ^ 130 == 149', '((x*139+193)%256) ^ 230 == 2', '((x*67+202)%256) ^ 227 == 24', '((x*103+151)%256) ^ 128 == 145']
bds_index = [13,2,10,5,17,14,7,1,0,8,4,6,9,15,11,12,3,16]
for y in xrange(18):
for x in xrange(256):
if eval(bds[y]):
flag[bds_index[y]] = x
break
payload= ""
for x in flag:
payload += chr(x)
print payload.encode('hex')
# b5aab27b5d01d6b91f021f59c97ddf6c76fa
$ python final.pyc
Please input your key(hex string):b5aab27b5d01d6b91f021f59c97ddf6c76fa
Good job! The flag is pwnhub{flag:your input(lower case)}

傻逼的半自动化脚本:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
#!/usr/bin/env python2.7
# -*- coding=utf-8 -*-
import dis, marshal
from pwn import u16
one_para = ["\x01","\x02", "\x03", "\x21", "\x17", "\x0a", "\x0b", "\x0c", "\x0f", "\x13", "\x14", "\x15", "\x16", "\x18", "\x19", "\x1a", "\x1c", "\x1e", "\x1f", "\x47", "\x48"]
f = open("final.pyc")
f.read(8)
code = marshal.load(f)
code = code.co_consts[45]
print code.co_name
asm = code.co_code
# asm = code.co_consts[45].co_code
stack = []
varn = {
'DIVIDER': None, # DEVIDER
'OOO000OOOOOO00OOO': None,
'OOOO000OO000OOOOO': None,
'OOO0OOO00O00OOOO0': None,
'O0OOO0O0O00O00OOO': None,
'O0O000OO00OO00OO0': None,
}
flag = 0
# n = 1632
n = 0
add = 0
while True:
if len(asm) <= n:
print n
break
if asm[n] == 'q': # JUMP_ABSOLUTE
n = u16(asm[n+1:n+3])
continue
elif asm[n] == 'n': # JUMP_FORWARD
n += u16(asm[n+1:n+3]) + 3
continue
# elif asm[n] in one_para:
# dis.disassemble_string(asm[n])
# n+=1
# continue
elif asm[n] == "\x53": # RETURN
dis.disassemble_string(asm[n])
break
try:
print "%d: "%n
dis.disassemble_string(asm[n:n+3])
add = 3
except IndexError:
try:
dis.disassemble_string(asm[n])
add = 1
except IndexError:
print "%d: %d"%(n, ord(asm[n]))
break
if asm[n] == 'd': # LOAD_CONST
key = u16(asm[n+1:n+3])
value = code.co_consts[key]
stack.append(value)
n += 3
elif asm[n] == '|': # LOAD_FAST
key = u16(asm[n+1:n+3])
value = varn[code.co_varnames[key]]
stack.append(value)
n += 3
elif asm[n] == '}': # STORE_FAST
key = code.co_varnames[u16(asm[n+1:n+3])]
varn[key] = stack.pop()
n += 3
elif asm[n] == 'k': # COMPARE_OP
x1 = stack.pop()
x2 = stack.pop()
if x1 == x2:
flag = 1
n += 3
elif asm[n] == "s": # POP_JUMP_IF_TRUE
if flag:
n = u16(asm[n+1:n+3])
flag = 0
else:
n += 3
# elif ord(asm[n]) == 114: # POP_JUMP_IF_FALSE
# break
else:
n += add

简直 6 ,你学会了吗?