Troubleshooting
- my gist commands.sh
- https://github.com/alex/what-happens-when#dns-lookup how a browser reaches Google
- https://medium.com/meatandmachines/what-really-happens-when-you-type-ls-l-in-the-shell-a8914950fd73 how the shell runs a command: fork, exec, wait
- the shell first checks whether "ls" is an alias or a built-in before searching PATH
- The environment is copied and passed to the new process
- https://askubuntu.com/questions/525767/what-does-an-exec-command-do exec with only a redirection, e.g. exec > file, makes the current shell send all subsequent output to file
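A minimal sketch of the two shell behaviours noted above: how a name is classified before anything is forked, and what exec with only a redirection does. The function names classify and demo_exec_redirect are made up for illustration; the sketch assumes a POSIX shell where command -v is available.

```shell
#!/bin/sh
# classify NAME: report how the shell would resolve NAME.
# Relies on "command -v": it prints an alias definition, a bare name
# (built-in or function), or an absolute path (external program).
classify() {
    resolved=$(command -v "$1") || { echo "not found"; return 1; }
    case "$resolved" in
        alias\ *) echo "alias" ;;
        /*)       echo "external" ;;
        *)        echo "builtin or function" ;;
    esac
}

# demo_exec_redirect FILE: the body runs in a subshell (note the parentheses),
# so only that subshell has its stdout re-pointed, not the caller.
demo_exec_redirect() (
    exec > "$1"          # no command given: exec only applies the redirection
    echo "first line"    # both lines land in FILE, not on the terminal
    echo "second line"
)
```

For example, classify cd reports a built-in, while classify sh reports an external program that would need fork and exec.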
- https://askubuntu.com/questions/280342/why-do-df-and-du-commands-show-different-disk-usage you can remove a file that is still in use by an application, and it remains available to that application because its file descriptor (visible under /proc/) is still held open. du no longer sees the file, but df keeps counting its blocks until the descriptor is closed. To find such files: # lsof | grep '(deleted)'
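The deleted-but-still-open behaviour can be reproduced in a few lines. This is only a sketch: the function name deleted_but_open_demo and the use of descriptor 3 are arbitrary choices for illustration.

```shell
#!/bin/sh
# Show that a file removed while a descriptor is open stays readable
# through that descriptor. Runs in a subshell so fd 3 does not leak.
deleted_but_open_demo() (
    f=$(mktemp)
    echo "still here" > "$f"
    exec 3< "$f"      # hold the file open for reading on descriptor 3
    rm -f "$f"        # unlink the name; the data survives while fd 3 is open
    if [ -e "$f" ]; then
        echo "unexpected: name still exists"
    fi
    cat <&3           # the contents are still readable via the descriptor
    exec 3<&-         # closing the last descriptor lets the kernel free the space
)
```

Calling deleted_but_open_demo prints the file's contents even though the name is already gone, which is the situation where df and du disagree.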
- Troubleshooting Apache
- https://www.digitalocean.com/community/tutorials/how-to-troubleshoot-common-site-issues-on-a-linux-server
- check the access log for connectivity and user-side errors; check the server (error) log for server-side errors
- check permissions on files, directories, and symlinks
- get a backtrace from a core dump: gdb /usr/bin/myapp.binary corefile, then run bt full at the gdb prompt
- Apache source file can be modified
- see which system calls a command normally makes: strace stat foo
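- How to implement traceroute? It exploits the IP TTL field: send probes with TTL 1, 2, 3, and so on. Each router that decrements the TTL to zero drops the probe and returns an ICMP Time Exceeded message, which reveals that hop. A hop that never responds shows up as a timeout.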
- How to build a yum package? Build the RPM, then register its metadata in the yum repo (e.g. with createrepo) so that yum can find it
- CPU wait time is a sub-category of idle time
- A CPU never spends clock cycles waiting for an I/O operation to complete. Instead, if a task running on a given CPU blocks on a synchronous I/O operation, the kernel suspends that task and allows other tasks to be scheduled on that CPU.
- For a given CPU, the I/O wait time is the time during which that CPU was idle (i.e. didn’t execute any tasks) and there was at least one outstanding disk I/O operation requested by a task scheduled on that CPU (at the time it generated that I/O request).
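To make the definition concrete, the I/O wait share over an interval can be derived from two samples of the aggregate cpu line in /proc/stat on Linux. A rough sketch: the function name iowait_pct is made up, and the field layout (user, nice, system, idle, iowait, irq, softirq, steal) is the one documented for Linux.

```shell
#!/bin/sh
# iowait_pct BEFORE AFTER
# Each argument is one aggregate "cpu ..." line as read from /proc/stat.
# Prints the percentage of elapsed CPU time accounted as iowait.
iowait_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        # $2=user $3=nice $4=system $5=idle $6=iowait $7=irq $8=softirq $9=steal
        # (guest time is already included in user and nice, so stop at $9)
        {
            total = 0
            for (i = 2; i <= 9; i++) total += $i
            t[NR] = total
            w[NR] = $6
        }
        END {
            d = t[2] - t[1]
            if (d > 0) printf "%.1f\n", 100 * (w[2] - w[1]) / d
            else print "0.0"
        }'
}

# Usage on a Linux host (one-second sample):
#   a=$(head -n 1 /proc/stat); sleep 1; b=$(head -n 1 /proc/stat)
#   iowait_pct "$a" "$b"
```

Because iowait is counted only while the CPU is otherwise idle, a busy CPU can hide outstanding I/O: the same disk load reports a lower iowait figure when other tasks are runnable.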
- Performance tuning for Cassandra (C*)
- Remedies for heavy I/O (high system CPU time and a large number of context switches and interrupts observed in vmstat/dstat)
- Reserve a core or two for interrupt processing. This is common in real-time use cases such as music production or high-frequency trading. (One of the difficult-to-observe benefits of a reserved CPU core is that the kernel's code can stay hot in cache on that core. The reserved core may never surpass 10% utilization, but the latency benefits are sometimes worth it.)
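- To pin an already running process and all its threads to cores 2-7: taskset -apc 2-7 $CASS_PID
- To start the process pinned to cores 2-7: taskset -c 2-7 ./cassandra -f
- The other approach is to distribute interrupts evenly over the cores in the system.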
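- General workflow
- top, htop, atop: check system-level performance
- dmesg, tail /var/log/syslog: check kernel messages and system logs
- strace -p PID: attach to a running process and trace its system calls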
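- Removing a file that is literally named -f:
- rm -f does not work: -f is parsed as the force option, so no file is named and nothing is removed
- rm -- -f works: -- marks the end of options
- rm ./-f works: the path no longer starts with a dash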
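The three rm variants can be tried safely in a scratch directory. A small sketch: the function name remove_dash_file_demo and the status messages are made up for illustration.

```shell
#!/bin/sh
# Try the three ways of "removing" a file named -f inside a temp directory.
# Runs in a subshell so the cd does not affect the caller.
remove_dash_file_demo() (
    dir=$(mktemp -d) && cd "$dir" || exit 1

    : > ./-f                  # create an empty file literally named "-f"
    rm -f                     # -f is parsed as an option; no operand, no error
    if [ -e ./-f ]; then echo "still exists"; fi

    rm -- -f                  # "--" ends option parsing, so -f is a file name
    if [ ! -e ./-f ]; then echo "removed"; fi

    : > ./-f                  # recreate it for the third variant
    rm ./-f                   # a path starting with "./" is never an option
    if [ ! -e ./-f ]; then echo "removed"; fi

    cd / && rmdir "$dir"      # clean up the scratch directory
)
```

The same two tricks (-- and a ./ prefix) apply to most other utilities that take file operands, such as mv, cp and cat.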
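- Defusing a fork bomb, e.g. :(){ :|:& };:
- shell built-ins can still be executed, because they do not need to fork a new process
- exec a bash script (exec replaces the current shell instead of forking): kill -9 -PGID to kill the whole process group; ulimit -u 1000 to cap the number of processes per user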
- What if umount fails with "device or resource busy"? (kill the processes using the mount; for NFS, kill the NFS client kernel threads)
- One common antipattern: a spurious spike changes the type of work (an occasional burst of high concurrency causes a large amount of data to be buffered and never released)
- High context-switch rate: usually caused by heavy I/O; batch the I/Os to reduce it