Commit 5e353a8

Merge pull request #1307 from kernelkit/flaky
2 parents c35342b + e596ffc commit 5e353a8

5 files changed, +177 -112 lines changed

doc/support.md

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
1+
# Support Data Collection
2+
3+
When troubleshooting issues or seeking support, the `support` command
4+
provides a convenient way to collect comprehensive system diagnostics.
5+
This command gathers configuration files, logs, network state, and other
6+
system information into a single compressed archive.
7+
8+
## Collecting Support Data
9+
10+
To collect support data and save it to a file:
11+
12+
```bash
13+
admin@host:~$ support collect > support-data.tar.gz
14+
(admin@host) Password: ***********
15+
Starting support data collection from host...
16+
This may take up to a minute. Please wait...
17+
Tailing /var/log/messages for 30 seconds (please wait)...
18+
Log tail complete.
19+
Collection complete. Creating archive...
20+
admin@host:~$ ls -l support-data.tar.gz
21+
-rw-rw-r-- 1 admin admin 508362 nov 30 13:05 support-data.tar.gz
22+
```
23+
24+
The command can also be run remotely via SSH from your workstation:
25+
26+
```bash
27+
$ ssh admin@host support collect > support-data.tar.gz
28+
...
29+
```
30+
31+
The collection process may take up to a minute depending on system load
32+
and the amount of logging data. Progress messages are shown during the
33+
collection process.
34+
35+
## Encrypted Collection
36+
37+
For secure transmission of support data, the archive can be encrypted
38+
with GPG using a password:
39+
40+
```bash
41+
admin@host:~$ support collect -p mypassword > support-data.tar.gz.gpg
42+
Starting support data collection from host...
43+
This may take up to a minute. Please wait...
44+
...
45+
Collection complete. Creating archive...
46+
Encrypting with GPG...
47+
```
48+
49+
The `support collect` command also accepts `-p` without a password, in
50+
which case it prompts for one interactively. This works over SSH too,
51+
but the local SSH client may then echo the password.
52+
53+
> [!TIP]
54+
> To hide the encryption password for an SSH session, the script supports
55+
> reading from stdin:
56+
> `echo "$MYSECRET" | ssh user@device support collect -p >
57+
> file.tar.gz.gpg`
58+
59+
After transferring the resulting file to your workstation, decrypt it
60+
with the password:
61+
62+
```bash
63+
$ gpg -d support-data.tar.gz.gpg > support-data.tar.gz
64+
$ tar xzf support-data.tar.gz
65+
...
66+
```
67+
68+
or
69+
70+
```bash
71+
$ gpg -d support-data.tar.gz.gpg | tar xz
72+
...
73+
```
74+
75+
> [!IMPORTANT]
76+
> Make sure to share `mypassword` out-of-band from the encrypted data
77+
> with the recipient of the data. For example, avoid sending both in the
78+
> same plain-text email.
79+
80+
## What is Collected
81+
82+
The support archive includes:
83+
84+
- System identification (hostname, uptime, kernel version)
85+
- Running and operational configuration (sysrepo datastores)
86+
- System logs (`/var/log` directory and live tail of messages log)
87+
- Network configuration and state (interfaces, routes, neighbors, bridges)
88+
- FRRouting information (OSPF, BFD status)
89+
- Container information (podman containers and their configuration)
90+
- System resource usage (CPU, memory, disk, processes)
91+
- Hardware information (PCI, USB devices, network interfaces)
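
Before sending an archive, its contents can be listed without unpacking it, to check which of the items above it contains. The following is a minimal sketch, assuming a standard `tar` with gzip support on the workstation. The `list_support_archive` helper and the demo file names are hypothetical and only illustrate the idea; they are not part of this commit.

```sh
#!/bin/sh
# Hypothetical helper: list the files in a (decrypted) support archive
# without extracting it.
list_support_archive()
{
    tar tzf "$1"
}

# Self-contained demo using a throwaway archive instead of real support data.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/support-demo"
echo "demo" > "$demo_dir/support-demo/hostname.txt"
tar czf "$demo_dir/demo.tar.gz" -C "$demo_dir" support-demo

listing=$(list_support_archive "$demo_dir/demo.tar.gz")
echo "$listing"
rm -rf "$demo_dir"
```

With a real archive the call would be `list_support_archive support-data.tar.gz`.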

doc/system.md

Lines changed: 0 additions & 88 deletions
@@ -323,94 +323,6 @@ reference ID, stratum, time offsets, frequency, and root delay.
323323
> The system uses `chronyd` Network Time Protocol (NTP) daemon. The
324324
> output shown here is best explained in the [Chrony documentation][4].
325325
326-
## Support Data Collection
327-
328-
When troubleshooting issues or seeking support, the `support` command
329-
provides a convenient way to collect comprehensive system diagnostics.
330-
This command gathers configuration files, logs, network state, and other
331-
system information into a single compressed archive.
332-
333-
### Collecting Support Data
334-
335-
To collect support data and save it to a file:
336-
337-
```bash
338-
admin@host:~$ support collect > support-data.tar.gz
339-
(admin@host) Password: ***********
340-
Starting support data collection from host...
341-
This may take up to a minute. Please wait...
342-
Tailing /var/log/messages for 30 seconds (please wait)...
343-
Log tail complete.
344-
Collection complete. Creating archive...
345-
admin@host:~$ ls -l support-data.tar.gz
346-
-rw-rw-r-- 1 admin admin 508362 nov 30 13:05 support-data.tar.gz
347-
```
348-
349-
The command can also be run remotely via SSH from your workstation:
350-
351-
```bash
352-
$ ssh admin@host support collect > support-data.tar.gz
353-
...
354-
```
355-
356-
The collection process may take up to a minute depending on system load
357-
and the amount of logging data. Progress messages are shown during the
358-
collection process.
359-
360-
### Encrypted Collection
361-
362-
For secure transmission of support data, the archive can be encrypted
363-
with GPG using a password:
364-
365-
```bash
366-
admin@host:~$ support collect -p mypassword > support-data.tar.gz.gpg
367-
Starting support data collection from host...
368-
This may take up to a minute. Please wait...
369-
...
370-
Collection complete. Creating archive...
371-
Encrypting with GPG...
372-
```
373-
374-
The `support collect` command even supports omitting `mypassword` and
375-
will then prompt interactively for the password. This works over SSH too,
376-
but the local ssh client may then echo the password.
377-
378-
> [!TIP]
379-
> To hide the encryption password for an SSH session, the script supports reading from stdin:
380-
> `echo "$MYSECRET" | ssh user@device support collect -p > file.tar.gz.gpg`
381-
382-
After transferring the resulting file to your workstation, decrypt it
383-
with the password:
384-
385-
```bash
386-
$ gpg -d support-data.tar.gz.gpg > support-data.tar.gz
387-
$ tar xzf support-data.tar.gz
388-
```
389-
390-
or
391-
392-
```bash
393-
$ gpg -d support-data.tar.gz.gpg | tar xz
394-
```
395-
396-
> [!IMPORTANT]
397-
> Make sure to share `mypassword` out-of-band from the encrypted data
398-
> with the recipient of the data. I.e., avoid sending both in the same
399-
> plain-text email for example.
400-
401-
### What is Collected
402-
403-
The support archive includes:
404-
405-
- System identification (hostname, uptime, kernel version)
406-
- Running and operational configuration (sysrepo datastores)
407-
- System logs (`/var/log` directory and live tail of messages log)
408-
- Network configuration and state (interfaces, routes, neighbors, bridges)
409-
- FRRouting information (OSPF, BFD status)
410-
- Container information (podman containers and their configuration)
411-
- System resource usage (CPU, memory, disk, processes)
412-
- Hardware information (PCI, USB devices, network interfaces)
413-
414326
[1]: https://www.rfc-editor.org/rfc/rfc7317
415327
[2]: https://github.com/kernelkit/infix/blob/main/src/confd/yang/infix-system%402024-02-29.yang
416328
[3]: https://www.rfc-editor.org/rfc/rfc8341

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -43,6 +43,7 @@ nav:
4343
- Hardware Info & Status: hardware.md
4444
- Management: management.md
4545
- Syslog Support: syslog.md
46+
- Support Data: support.md
4647
- Upgrade: upgrade.md
4748
- Scripting:
4849
- Introduction: scripting.md

src/bin/support

Lines changed: 55 additions & 24 deletions
@@ -1,5 +1,9 @@
11
#!/bin/sh
22
# Support utilities for troubleshooting Infix systems
3+
4+
# Program name for usage messages (supports being renamed by users)
5+
prognm=$(basename "$0")
6+
37
#
48
# The collect command gathers system information and outputs a tarball.
59
# Data is collected to /var/lib/support (or $HOME as fallback) and then
@@ -9,35 +13,36 @@
913
# systems that do not yet have this script in the root filesystems.
1014
#
1115
# 1. Copy this script to the target device's home directory:
12-
# scp support user@device:
16+
#
17+
# scp support user@device:
1318
#
1419
# 2. SSH to the device and make it executable:
15-
# ssh user@device
16-
# chmod +x ~/support
20+
#
21+
# ssh user@device chmod +x support
1722
#
1823
# 3. Run the script from your home directory:
1924
#
20-
# ~/support collect > support-data.tar.gz
25+
# ./support collect > support-data.tar.gz
2126
#
2227
# Or directly via SSH from your workstation:
2328
#
24-
# ssh user@device '~/support collect' > support-data.tar.gz
29+
# ssh user@device './support collect' > support-data.tar.gz
2530
#
2631
# Optionally, the output can be encrypted with GPG using a password for
2732
# secure transmission to support personnel, see below.
2833
#
2934
# Examples:
30-
# support collect > support-data.tar.gz
31-
# support collect -s 5 > support-data.tar.gz
32-
# support collect -p > support-data.tar.gz.gpg
33-
# support collect -p mypass > support-data.tar.gz.gpg
35+
# ./support collect > support-data.tar.gz
36+
# ./support collect -s 5 > support-data.tar.gz
37+
# ./support collect -p > support-data.tar.gz.gpg
38+
# ./support collect -p mypass > support-data.tar.gz.gpg
3439
#
35-
# ssh user@device support collect > support-data.tar.gz
36-
# ssh user@device support collect -p mypass > support-data.tar.gz.gpg
40+
# ssh user@device ./support collect > support-data.tar.gz
41+
# ssh user@device ./support collect -p mypass > support-data.tar.gz.gpg
3742
#
3843
# Note, interactive password prompt (-p without argument) may echo characters
3944
# over SSH due to local terminal echo. Use -p PASSWORD for remote execution,
40-
# or pipe the password: echo "password" | ssh user@device support collect -p
45+
# or pipe the password: echo "password" | ssh user@device ./support collect -p
4146
# meaning you can even: echo "$SECRET_VARIABLE" | ... which in some cases can
4247
# come in handy.
4348
#
@@ -91,7 +96,7 @@ cmd_collect()
9196
;;
9297
*)
9398
echo "Error: Unknown option '$1'" >&2
94-
echo "Usage: support collect [--log-sec|-s N] [--password|-p PASSWORD]" >&2
99+
echo "Usage: $prognm collect [--log-sec|-s N] [--password|-p PASSWORD]" >&2
95100
exit 1
96101
;;
97102
esac
@@ -129,8 +134,12 @@ cmd_collect()
129134
# Cleanup on exit
130135
cleanup()
131136
{
137+
echo "[$(date -Iseconds)] Cleanup called (signal: ${1:-EXIT})" >> "${EXEC_LOG}" 2>&1 || echo "[$(date -Iseconds)] Cleanup called (signal: ${1:-EXIT})" >&2
132138
if [ -d "${COLLECT_DIR}" ]; then
139+
echo "[$(date -Iseconds)] Removing collection directory: ${COLLECT_DIR}" >> "${EXEC_LOG}" 2>&1 || echo "[$(date -Iseconds)] Removing: ${COLLECT_DIR}" >&2
133140
rm -rf "${COLLECT_DIR}"
141+
else
142+
echo "[$(date -Iseconds)] Collection directory already gone: ${COLLECT_DIR}" >> "${EXEC_LOG}" 2>&1 || echo "[$(date -Iseconds)] Already gone: ${COLLECT_DIR}" >&2
134143
fi
135144
}
136145
trap cleanup EXIT INT TERM
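
In the hunk above, `cleanup` logs `${1:-EXIT}`, but the visible `trap cleanup EXIT INT TERM` registration passes no argument, so the logged name would presumably always be `EXIT`. Below is a minimal sketch of one way to pass the name, assuming nothing else in the script supplies `$1` to the handler. The stand-in `cleanup` body and the exit codes 130 and 143 (the common 128+signal convention) are illustrative and not taken from the commit.

```sh
#!/bin/sh
# Stand-in for the script's cleanup(); it only reports the argument it got.
cleanup()
{
    echo "Cleanup called (signal: ${1:-EXIT})"
}

# One trap per condition so the handler receives the name as $1.
# The INT/TERM handlers clear the EXIT trap first to avoid running twice.
trap 'cleanup EXIT' EXIT
trap 'cleanup INT; trap - EXIT; exit 130' INT
trap 'cleanup TERM; trap - EXIT; exit 143' TERM

cleanup        # no argument: falls back to EXIT
cleanup TERM   # explicit name, as the TERM trap would pass it
```

Whether the extra detail in the log is worth three trap lines instead of one is a judgment call for the script's maintainers.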
@@ -383,7 +392,14 @@ cmd_collect()
383392

384393
# Create final tar.gz and output to stdout
385394
# Use -C to change to parent directory so paths in archive don't include full path
386-
cd "${WORK_DIR}"
395+
echo "[$(date -Iseconds)] Changing to work directory: ${WORK_DIR}" >> "${EXEC_LOG}" 2>&1
396+
if ! cd "${WORK_DIR}"; then
397+
echo "[$(date -Iseconds)] ERROR: Failed to cd to ${WORK_DIR}" >> "${EXEC_LOG}" 2>&1
398+
echo "Error: Cannot change to work directory ${WORK_DIR}" >&2
399+
exit 1
400+
fi
401+
echo "[$(date -Iseconds)] Successfully changed to: $(pwd)" >> "${EXEC_LOG}" 2>&1
402+
echo "[$(date -Iseconds)] Creating archive from: $(basename "${COLLECT_DIR}")" >> "${EXEC_LOG}" 2>&1
387403

388404
# Check if password encryption is requested
389405
if [ -n "$PASSWORD" ]; then
@@ -392,14 +408,29 @@ cmd_collect()
392408
exit 1
393409
fi
394410
echo "Encrypting with GPG..." >&2
411+
echo "[$(date -Iseconds)] Starting tar with GPG encryption" >> "${EXEC_LOG}" 2>&1
395412
tar czf - "$(basename "${COLLECT_DIR}")" 2>> "${EXEC_LOG}" | \
396413
gpg --batch --yes --passphrase "$PASSWORD" --pinentry-mode loopback -c 2>> "${EXEC_LOG}"
414+
tar_exit=$?
415+
echo "[$(date -Iseconds)] tar+gpg pipeline exit code: $tar_exit" >> "${EXEC_LOG}" 2>&1
397416
echo "" >&2
398417
echo "WARNING: Remember to share the encryption password out-of-band!" >&2
399418
echo " Do not send it in the same email as the encrypted file." >&2
419+
if [ $tar_exit -ne 0 ]; then
420+
echo "[$(date -Iseconds)] ERROR: tar+gpg failed with exit code $tar_exit" >> "${EXEC_LOG}" 2>&1
421+
exit $tar_exit
422+
fi
400423
else
424+
echo "[$(date -Iseconds)] Starting tar (no encryption)" >> "${EXEC_LOG}" 2>&1
401425
tar czf - "$(basename "${COLLECT_DIR}")" 2>> "${EXEC_LOG}"
426+
tar_exit=$?
427+
echo "[$(date -Iseconds)] tar exit code: $tar_exit" >> "${EXEC_LOG}" 2>&1
428+
if [ $tar_exit -ne 0 ]; then
429+
echo "[$(date -Iseconds)] ERROR: tar failed with exit code $tar_exit" >> "${EXEC_LOG}" 2>&1
430+
exit $tar_exit
431+
fi
402432
fi
433+
echo "[$(date -Iseconds)] Archive creation completed successfully" >> "${EXEC_LOG}" 2>&1
403434
}
404435

405436
cmd_clean()
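
A note on the `tar_exit=$?` line added after the `tar | gpg` pipeline: in POSIX `sh`, `$?` after a pipeline is the exit status of the last command only, here `gpg`, so a failing `tar` stage would not show up in that variable. The sketch below illustrates the behaviour with stand-in commands and shows one portable way to capture the first stage's status through a temporary file. The `status_file` approach and the variable names are hypothetical and not part of the commit.

```sh
#!/bin/sh
# In POSIX sh, $? after a pipeline is the exit status of the LAST command.
false | true
last_only=$?    # the failure of the first command is not visible here

# Hypothetical workaround: the first stage records its own status in a file.
status_file=$(mktemp)
{
    rc=0
    false || rc=$?              # stand-in for the tar stage
    echo "$rc" > "$status_file"
} | cat > /dev/null             # stand-in for the gpg stage
first_status=$(cat "$status_file")
rm -f "$status_file"

echo "pipeline status: $last_only, first stage status: $first_status"
```

Shells such as bash offer `set -o pipefail` for the same purpose, but that may not be available in the `/bin/sh` this script targets.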
@@ -424,7 +455,7 @@ cmd_clean()
424455
;;
425456
*)
426457
echo "Error: Unknown option '$1'" >&2
427-
echo "Usage: support clean [--dry-run] [--days N]" >&2
458+
echo "Usage: $prognm clean [--dry-run] [--days N]" >&2
428459
exit 1
429460
;;
430461
esac
@@ -483,7 +514,7 @@ cmd_clean()
483514

484515
usage()
485516
{
486-
echo "Usage: support [global-options] <command> [options]"
517+
echo "Usage: $prognm [global-options] <command> [options]"
487518
echo ""
488519
echo "Global options:"
489520
echo " -w, --work-dir PATH Use PATH as working directory for collection/cleanup"
@@ -505,14 +536,14 @@ usage()
505536
echo " -d, --days N Remove directories older than N days (default: 7)"
506537
echo ""
507538
echo "Examples:"
508-
echo " support collect > support-data.tar.gz"
509-
echo " support collect -p > support-data.tar.gz.gpg"
510-
echo " support collect --password mypass > support-data.tar.gz.gpg"
511-
echo " support --work-dir /tmp/ram collect > support-data.tar.gz"
512-
echo " ssh user@device support collect > support-data.tar.gz"
513-
echo " support clean --dry-run"
514-
echo " support clean --days 30"
515-
echo " support --work-dir /tmp/ram clean"
539+
echo " $prognm collect > support-data.tar.gz"
540+
echo " $prognm collect -p > support-data.tar.gz.gpg"
541+
echo " $prognm collect --password mypass > support-data.tar.gz.gpg"
542+
echo " $prognm --work-dir /tmp/ram collect > support-data.tar.gz"
543+
echo " ssh user@device $prognm collect > support-data.tar.gz"
544+
echo " $prognm clean --dry-run"
545+
echo " $prognm clean --days 30"
546+
echo " $prognm --work-dir /tmp/ram clean"
516547
exit 1
517548
}
518549
