Contents
  1. Introduction
  2. Case 1
  3. Case 2
  4. Reflections
  5. Further reading

Introduction

LLDB, the debugger of the LLVM project, is very widely used. Paired with debugserver, it supports both local and remote debugging well.

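Plenty of introductions to both programs already exist online, so I won't repeat them here. This post records two cases from my own work in which I used LLDB, without access to the source code, to pin down the exact symbol behind a crash. Together they suggest a general approach to this class of problem: find the specific symbol that a given feature calls, so that you can troubleshoot the issue or write a hook for that symbol.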

Case 1

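Sometimes an app quits on its own shortly after launch, and without the source code we cannot tell why. In that situation we can attach LLDB to the process with the attach command, set breakpoints on a few likely symbols, and watch what happens. The example below uses a process named DACS_ARM_Tool.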

Attach by process name and wait for the process to launch:

attach -n DACS_ARM_Tool --waitfor

Launch the program, and lldb prints the following output:

(lldb) attach -n DACS_ARM_Tool --waitfor
Process 30772 stopped
* thread #1, stop reason = signal SIGSTOP
frame #0: 0x000000010f6b062a dyld`stat64 + 10
dyld`stat64:
-> 0x10f6b062a <+10>: jae 0x10f6b0634 ; <+20>
0x10f6b062c <+12>: movq %rax, %rdi
0x10f6b062f <+15>: jmp 0x10f6ae408 ; cerror_nocancel
0x10f6b0634 <+20>: retq
Target 0: (DACS_ARM_Tool) stopped.

The output shows that the process has been stopped; the current thread is thread #1 and the stop reason is signal SIGSTOP.

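At this point it is worth asking which symbols an app usually calls when it quits on its own. Common candidates are -[NSApplication terminate:] and exit, so we start by setting breakpoints on these two.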

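The LLDB command for setting a breakpoint is breakpoint set, abbreviated br s, followed by an option that says how to match the symbol and then the symbol itself: -n matches by name and -r matches by regular expression. See the official documentation linked at the end of this post for details.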

(lldb) br s -n exit
Breakpoint 1: where = Foundation`+[NSThread exit], address = 0x00007fff360a8a10
(lldb) br s -n "-[NSApplication terminate:]"
Breakpoint 2: where = AppKit`-[NSApplication terminate:], address = 0x00007fff30ec2a7f
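When the list of candidate exit symbols grows, typing each breakpoint by hand gets tedious. The following is a minimal sketch, not part of the original workflow, that generates one breakpoint set -n line per candidate and writes them to a command file. The helper names, the file name, and the extra candidates _exit and abort are assumptions added for illustration.

```python
# Hypothetical helper: build an LLDB command file that sets a name
# breakpoint on each candidate "exit" symbol.

# The first two candidates come from the text above; `_exit` and `abort`
# are extra guesses added for illustration only.
CANDIDATES = [
    "exit",
    "-[NSApplication terminate:]",
    "_exit",
    "abort",
]


def breakpoint_commands(symbols):
    """Return one `breakpoint set -n` line per symbol."""
    lines = []
    for name in symbols:
        # Double quotes keep Objective-C selectors such as
        # -[NSApplication terminate:] together as a single argument.
        lines.append('breakpoint set -n "{}"'.format(name))
    return lines


def write_command_file(path, symbols=CANDIDATES):
    """Write the generated commands to `path`, one per line."""
    with open(path, "w") as handle:
        handle.write("\n".join(breakpoint_commands(symbols)) + "\n")
```

After calling write_command_file("exit_breakpoints.lldb"), the file can be loaded in the attached session with command source exit_breakpoints.lldb.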

Next, enter c to resume the process. The br list command then shows where the breakpoints were actually resolved:

(lldb) br list
Current breakpoints:
1: name = 'exit', locations = 3, resolved = 3, hit count = 1
1.1: where = Foundation`+[NSThread exit], address = 0x00007fff360a8a10, resolved, hit count = 1
1.2: where = libsystem_c.dylib`exit, address = 0x00007fff6db323db, resolved, hit count = 0
1.3: where = Security`Security::CountingMutex::exit(), address = 0x00007fff4028afe8, resolved, hit count = 0

2: name = '-[NSApplication terminate:]', locations = 1, resolved = 1, hit count = 0
2.1: where = AppKit`-[NSApplication terminate:], address = 0x00007fff30ec2a7f, resolved, hit count = 0

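There are four breakpoint locations in total. Let the program keep running and trigger its exit logic; the process is then stopped just before it quits. The bt command prints the call stack, which usually reveals the cause. (The output below appears to come from a separate debugging session, so its process ID and breakpoint number differ from the listing above.)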

Process 35564 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 4.1
frame #0: 0x00007fff30ec2a7f AppKit`-[NSApplication terminate:]
AppKit`-[NSApplication terminate:]:
-> 0x7fff30ec2a7f <+0>: pushq %rbp
0x7fff30ec2a80 <+1>: movq %rsp, %rbp
0x7fff30ec2a83 <+4>: pushq %r15
0x7fff30ec2a85 <+6>: pushq %r14
Target 0: (DACS_ARM_Tool) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 4.1
* frame #0: 0x00007fff30ec2a7f AppKit`-[NSApplication terminate:]
frame #1: 0x000000010bf5add2 DACS_ARM_Tool`-[ViewController testCrash:](self=0x00007fa73d81e320, _cmd="testCrash:", sender=0x00007fa73c5b3450) at ViewController.mm:406:5
frame #2: 0x00007fff30e377a7 AppKit`-[NSApplication(NSResponder) sendAction:to:from:] + 299
frame #3: 0x00007fff30e37642 AppKit`-[NSControl sendAction:to:] + 86
frame #4: 0x00007fff30e37574 AppKit`__26-[NSCell _sendActionFrom:]_block_invoke + 136
frame #5: 0x00007fff30e37476 AppKit`-[NSCell _sendActionFrom:] + 171
frame #6: 0x00007fff30e373bd AppKit`-[NSButtonCell _sendActionFrom:] + 96
frame #7: 0x00007fff30e3369b AppKit`NSControlTrackMouse + 1745
frame #8: 0x00007fff30e32fa2 AppKit`-[NSCell trackMouse:inRect:ofView:untilMouseUp:] + 130
frame #9: 0x00007fff30e32e61 AppKit`-[NSButtonCell trackMouse:inRect:ofView:untilMouseUp:] + 691
frame #10: 0x00007fff30e321dd AppKit`-[NSControl mouseDown:] + 748
frame #11: 0x00007fff30e305f0 AppKit`-[NSWindow(NSEventRouting) _handleMouseDownEvent:isDelayedEvent:] + 4914
frame #12: 0x00007fff30d9ae21 AppKit`-[NSWindow(NSEventRouting) _reallySendEvent:isDelayedEvent:] + 2612
frame #13: 0x00007fff30d9a1c9 AppKit`-[NSWindow(NSEventRouting) sendEvent:] + 349
frame #14: 0x00007fff30d98554 AppKit`-[NSApplication(NSEvent) sendEvent:] + 352
frame #15: 0x00007fff30be55bf AppKit`-[NSApplication run] + 707
frame #16: 0x00007fff30bb7396 AppKit`NSApplicationMain + 777
frame #17: 0x000000010bf67a6f DACS_ARM_Tool`main(argc=1, argv=0x00007ffee3caf538) at main.m:14:12
frame #18: 0x00007fff6da88cc9 libdyld.dylib`start + 1
frame #19: 0x00007fff6da88cc9 libdyld.dylib`start + 1

The output shows that the program entered -[ViewController testCrash:], which called terminate: and caused the app to quit.

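That completes the analysis. The example is simple, so the cause shows up quickly. Real projects are usually messier: the program may quit through symbols other than the two we set breakpoints on, or several functions named exit may be hit on the way out, so the first bt does not show the call stack you actually care about and you have to work through the stops one by one.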
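When many exit-like breakpoints fire in a row, one option is to let LLDB log a short backtrace at every hit and keep going, instead of stopping each time. The sketch below uses LLDB's Python breakpoint-callback interface. The module name exit_trace.py and both function names are hypothetical, and the code was not part of the original investigation.

```python
# exit_trace.py (hypothetical module name). Load it inside LLDB with
#   command script import /path/to/exit_trace.py
# and attach it to breakpoint 1 with
#   breakpoint command add -F exit_trace.log_and_continue 1


def describe_frames(frames):
    """Format (module, function) pairs as numbered backtrace lines."""
    return ["  #{} {}`{}".format(index, module, function)
            for index, (module, function) in enumerate(frames)]


def log_and_continue(frame, bp_loc, internal_dict):
    """Breakpoint callback: print a short backtrace, then keep running."""
    thread = frame.GetThread()
    frames = []
    for index in range(thread.GetNumFrames()):
        current = thread.GetFrameAtIndex(index)
        module = current.GetModule().GetFileSpec().GetFilename()
        frames.append((module, current.GetFunctionName()))
    print("hit breakpoint location {}.{}".format(
        bp_loc.GetBreakpoint().GetID(), bp_loc.GetID()))
    print("\n".join(describe_frames(frames)))
    # Returning False tells LLDB not to stop at this hit.
    return False
```

Because the callback returns False, the process runs on after each hit, and the console ends up with one backtrace per exit-like call to compare afterwards.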

Case 2

After I injected a dynamic library into the login process, login would no longer start. It printed one line to the console and then crashed:

2021-10-09 16:07:52.355 login[63473:8543240] The application with bundle ID (null) is running setugid(), which is not allowed. Exiting.

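Read literally, the message says that the process is running setugid, which is not allowed, and that its bundle ID is (null). I suspected that some hook code in the injected library was triggering it, but the library contains a lot of code, and commenting it out line by line would have taken a long time. Setting breakpoints on a few symbols in LLDB to locate the failing spot promised to be much faster.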

Time to investigate. As before, start LLDB and attach by process name:

attach -n login --waitfor

Run login by itself in a terminal; lldb attaches successfully and stops the process:

(lldb) attach -n login --waitfor
Process 68174 stopped
* thread #1, stop reason = signal SIGSTOP
frame #0: 0x000000011b9de000 dyld`_dyld_start
dyld`_dyld_start:
-> 0x11b9de000 <+0>: popq %rdi
0x11b9de001 <+1>: pushq $0x0
0x11b9de003 <+3>: movq %rsp, %rbp
0x11b9de006 <+6>: andq $-0x10, %rsp
Target 0: (login) stopped.

This time I did not know which symbol triggers the crash, so I used a regular expression to match symbols loosely:

(lldb) br s -r "bundle"
Breakpoint 5: no locations (pending).

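No locations are resolved yet. After resuming, the process runs briefly and stops again almost immediately, and now the locations are reported: more than 300 of them.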

(lldb) c
Process 68174 resuming
339 locations added to breakpoint 5
Process 68174 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 5.19
frame #0: 0x00007fff6dcc1e44 libxpc.dylib`+[OS_xpc_bundle load]
libxpc.dylib`+[OS_xpc_bundle load]:
-> 0x7fff6dcc1e44 <+0>: retq

libxpc.dylib`+[OS_xpc_service_instance load]:
0x7fff6dcc1e45 <+0>: retq

libxpc.dylib`+[OS_xpc_activity load]:
0x7fff6dcc1e46 <+0>: retq

libxpc.dylib`+[OS_xpc_file_transfer load]:
0x7fff6dcc1e47 <+0>: retq
Target 0: (login) stopped.

The process has stopped at the symbol +[OS_xpc_bundle load]. The br list command shows all the breakpoint locations (there are far too many, so only an excerpt is shown here):

(lldb) br list
Current breakpoints:
6: regex = 'bundle', locations = 339, resolved = 339, hit count = 0
6.8: where = libsystem_networkextension.dylib`ne_copy_cached_synthesized_uuids_for_bundle_identifier_locked, address = 0x00007fff6dc6bf3d, resolved, hit count = 0
6.9: where = libsystem_networkextension.dylib`__ne_copy_cached_uuids_for_bundle_identifier_block_invoke, address = 0x00007fff6dc6c058, resolved, hit count = 0
6.10: where = libsystem_networkextension.dylib`ne_copy_cached_bundle_identifier_for_synthesized_uuid_locked, address = 0x00007fff6dc6c228, resolved, hit count = 0
6.11: where = libsystem_networkextension.dylib`__ne_copy_cached_bundle_identifier_for_uuid_block_invoke, address = 0x00007fff6dc6c307, resolved, hit count = 0
6.12: where = libsystem_networkextension.dylib`__ne_copy_cached_bundle_identifier_for_uuid_block_invoke_2, address = 0x00007fff6dc6c39f, resolved, hit count = 0
6.13: where = libsystem_networkextension.dylib`__ne_copy_cached_bundle_identifier_for_synthesized_uuid_locked_block_invoke, address = 0x00007fff6dc6d7f7, resolved, hit count = 0
6.14: where = libsystem_networkextension.dylib`ne_copy_cached_bundle_identifier_for_uuid, address = 0x00007fff6dc6c0f9, resolved, hit count = 0
6.15: where = libsystem_networkextension.dylib`ne_copy_cached_preferred_bundle_for_bundle_identifier, address = 0x00007fff6dc6c436, resolved, hit count = 0
6.16: where = libsystem_networkextension.dylib`ne_copy_cached_uuids_for_bundle_identifier, address = 0x00007fff6dc67836, resolved, hit count = 0
6.17: where = libsystem_sandbox.dylib`gpu_bundle_find_trusted, address = 0x00007fff6dc93d10, resolved, hit count = 0
6.18: where = libsystem_sandbox.dylib`gpu_bundle_is_path_trusted, address = 0x00007fff6dc93ef3, resolved, hit count = 0
6.19: where = libxpc.dylib`+[OS_xpc_bundle load], address = 0x00007fff6dcc1e44, resolved, hit count = 0
... (some entries omitted) ...
6.73: where = Foundation`-[NSBundle bundleIdentifier], address = 0x00007fff35fc955c, resolved, hit count = 0
6.74: where = Foundation`+[NSBundle bundleWithIdentifier:], address = 0x00007fff35fc9d5d, resolved, hit count = 0
6.75: where = Foundation`+[NSBundle bundleWithPath:], address = 0x00007fff35fc9ebc, resolved, hit count = 0
6.76: where = Foundation`+[NSBundle bundleForClass:], address = 0x00007fff35fe7226, resolved, hit count = 0
6.77: where = Foundation`-[__NSBundleTables bundleForClass:], address = 0x00007fff35fe742d, resolved, hit count = 0
6.78: where = Foundation`-[NSBundle bundleURL], address = 0x00007fff36052db5, resolved, hit count = 0
6.79: where = Foundation`+[NSBundle bundleWithURL:], address = 0x00007fff36065016, resolved, hit count = 0

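The regular expression clearly matches far too many symbols. With breakpoints on several hundred of them, the process keeps stopping at unrelated locations, which makes the investigation very hard. Entering c at this point only stops the process at yet another breakpoint. So we have to delete all the breakpoints and set a new one with a more specific pattern.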

Delete all breakpoints with the following command:

br del
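Before picking a narrower pattern, it can help to see which modules contributed the most locations to an overly broad regular expression. The following is a minimal sketch that tallies saved br list output by module. The function name is hypothetical, and the sample lines are copied from the excerpt above.

```python
import re
from collections import Counter

# Matches one location line of `br list` output, for example:
#   6.17: where = libsystem_sandbox.dylib`gpu_bundle_find_trusted, address = 0x...
LOCATION = re.compile(r"where = ([^`]+)`(.+), address = 0x[0-9a-fA-F]+")


def locations_by_module(br_list_text):
    """Count breakpoint locations per module in saved `br list` output."""
    counts = Counter()
    for line in br_list_text.splitlines():
        match = LOCATION.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


SAMPLE = """\
6.16: where = libsystem_networkextension.dylib`ne_copy_cached_uuids_for_bundle_identifier, address = 0x00007fff6dc67836, resolved, hit count = 0
6.17: where = libsystem_sandbox.dylib`gpu_bundle_find_trusted, address = 0x00007fff6dc93d10, resolved, hit count = 0
6.18: where = libsystem_sandbox.dylib`gpu_bundle_is_path_trusted, address = 0x00007fff6dc93ef3, resolved, hit count = 0
6.73: where = Foundation`-[NSBundle bundleIdentifier], address = 0x00007fff35fc955c, resolved, hit count = 0
"""

# Print modules from most to fewest matched locations.
for module, count in locations_by_module(SAMPLE).most_common():
    print(module, count)
```

Once the noisy modules are known, the breakpoint can be limited to a single library with the -s (--shlib) option of breakpoint set, for example br s -r bundleIdentifier -s Foundation.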

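Narrowing the pattern is mostly trial and error: look at which symbols the injected library uses, or at which symbols have something to do with the bundle ID. After a series of experiments, I found that a breakpoint on bundleIdentifier leads to the last calls made before the crash.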

(lldb) br s -r bundleIdentifier
Breakpoint 7: 40 locations.

Enter c to continue. The process crashed after I continued a second time, so the crash site must be where the breakpoint was hit the second time.

(lldb) c
Process 16886 resuming
Process 16886 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 8.1
frame #0: 0x00007fff35fc955c Foundation`-[NSBundle bundleIdentifier]
Foundation`-[NSBundle bundleIdentifier]:
-> 0x7fff35fc955c <+0>: pushq %rbp
0x7fff35fc955d <+1>: movq %rsp, %rbp
0x7fff35fc9560 <+4>: pushq %rbx
0x7fff35fc9561 <+5>: pushq %rax

So once the breakpoint has been hit for the second time, check the call stack:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 8.1
* frame #0: 0x00007fff35fc955c Foundation`-[NSBundle bundleIdentifier]
frame #1: 0x00007fff30bb7805 AppKit`_NSCheckForIllegalSetugidApp + 157
frame #2: 0x00007fff30bd7251 AppKit`+[NSWindow initialize] + 60
frame #3: 0x00007fff6c8e2795 libobjc.A.dylib`CALLING_SOME_+initialize_METHOD + 17
frame #4: 0x00007fff6c8e30a6 libobjc.A.dylib`initializeNonMetaClass + 646
frame #5: 0x00007fff6c8e4062 libobjc.A.dylib`initializeAndMaybeRelock(objc_class*, objc_object*, mutex_tt<false>&, bool) + 214
frame #6: 0x00007fff6c8d4d45 libobjc.A.dylib`lookUpImpOrForward + 1072
frame #7: 0x00007fff6c8d4399 libobjc.A.dylib`_objc_msgSend_uncached + 73
frame #8: 0x00000001010c7a54 libDataCube.dylib`+[NSWindow(self=NSWindow, _cmd="load") load] at NSWindow+Logo.m:18:5
frame #9: 0x00007fff6c8d3393 libobjc.A.dylib`load_images + 1068
frame #10: 0x000000010780d26c dyld`dyld::notifySingle(dyld_image_states, ImageLoader const*, ImageLoader::InitializerTimingList*) + 418
frame #11: 0x0000000107820fe9 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 475
frame #12: 0x0000000107820f66 dyld`ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int, char const*, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 344
frame #13: 0x000000010781f0b4 dyld`ImageLoader::processInitializers(ImageLoader::LinkContext const&, unsigned int, ImageLoader::InitializerTimingList&, ImageLoader::UninitedUpwards&) + 188
frame #14: 0x000000010781f154 dyld`ImageLoader::runInitializers(ImageLoader::LinkContext const&, ImageLoader::InitializerTimingList&) + 82
frame #15: 0x000000010780d6a8 dyld`dyld::initializeMainExecutable() + 199
frame #16: 0x0000000107812bba dyld`dyld::_main(macho_header const*, unsigned long, int, char const**, char const**, char const**, unsigned long*) + 6667
frame #17: 0x000000010780c227 dyld`dyldbootstrap::start(dyld3::MachOLoaded const*, int, char const**, dyld3::MachOLoaded const*, unsigned long*) + 453
frame #18: 0x000000010780c025 dyld`_dyld_start + 37

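Now the cause is clear. The injected library uses the NSWindow class: its +load method in NSWindow+Logo.m (frame #8) sends a message to NSWindow, which triggers +[NSWindow initialize], and that in turn calls _NSCheckForIllegalSetugidApp. Judging by the function name and the log message, this is the check that rejects setugid processes, and it reads the bundle ID along the way. login is not structured like an ordinary app, so the check fails and the process crashes.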

With the cause known, removing all NSWindow-related code from the injected library lets the injected login run normally.

Reflections

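Both cases used LLDB to find exactly where a problem occurs. The same approach works beyond crashes, for example for networking problems: find out which symbols the program actually calls, then address the real cause. LLDB has many other features as well, such as the d command, which shows the assembly instructions at the point where the process is stopped: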

(lldb) d
Foundation`-[NSBundle bundleIdentifier]:
0x7fff35fc955c <+0>: pushq %rbp
0x7fff35fc955d <+1>: movq %rsp, %rbp
-> 0x7fff35fc9560 <+4>: pushq %rbx
0x7fff35fc9561 <+5>: pushq %rax
0x7fff35fc9562 <+6>: movq 0x558bccdf(%rip), %rsi ; "infoDictionary"
0x7fff35fc9569 <+13>: movq 0x557e4aa8(%rip), %rbx ; (void *)0x00007fff6c8d3800: objc_msgSend
0x7fff35fc9570 <+20>: callq *%rbx
0x7fff35fc9572 <+22>: movq 0x557e468f(%rip), %rcx ; (void *)0x00007fff8b1f2ce0: kCFBundleIdentifierKey
0x7fff35fc9579 <+29>: movq (%rcx), %rdx
0x7fff35fc957c <+32>: movq 0x558bbe8d(%rip), %rsi ; "objectForKey:"
0x7fff35fc9583 <+39>: movq %rax, %rdi
0x7fff35fc9586 <+42>: movq %rbx, %rax
0x7fff35fc9589 <+45>: addq $0x8, %rsp
0x7fff35fc958d <+49>: popq %rbx
0x7fff35fc958e <+50>: popq %rbp
0x7fff35fc958f <+51>: jmpq *%rax
0x7fff35fc9591 <+53>: nop
0x7fff35fc9592 <+54>: nop
0x7fff35fc9593 <+55>: nop
0x7fff35fc9594 <+56>: nop
0x7fff35fc9595 <+57>: nop
0x7fff35fc9596 <+58>: nop
0x7fff35fc9597 <+59>: nop
0x7fff35fc9598 <+60>: nop
0x7fff35fc9599 <+61>: nop

I am not very familiar with LLDB's other features, so I will leave it there.

Further reading

Introduction to the official LLDB tutorial
