@dguido dguido commented Nov 29, 2025

Summary

Closes #14912

Current integration tests verify that VPN services start, but don't verify they work. This adds true E2E tests using Linux network namespaces to simulate a client connecting to the server on GitHub Actions runners.

Changes

  • Add tests/e2e/test-vpn-connectivity.sh - Main E2E test script (~450 lines)
  • Add tests/e2e/README.md - Documentation for running tests locally
  • Update integration-tests.yml - Run E2E tests after deployment
  • Delete tests/legacy-lxd/ - Replaced by new E2E tests (was dead code)
  • Rewrite tests/README.md - Cleaner, more practical documentation

What Gets Tested

| Test | What's Verified |
|------|-----------------|
| WireGuard | Handshake completes, ping to server VPN IP, DNS through tunnel |
| IPsec | Certificate chain valid, service listening, ports reachable |
| Validation | mobileconfig XML syntax, CA constraints |
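
As an illustration of the "handshake completes" check, here is a minimal sketch of how a script could poll for a completed handshake. It assumes the interface is `wg0` and relies on `wg show <iface> latest-handshakes`, which prints one public key and epoch timestamp per peer (0 means no handshake yet). The helper names `handshake_done` and `wait_for_handshake` are hypothetical and not taken from the PR's script.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the 30 second default are assumptions.

handshake_done() {
  # Reads `wg show <iface> latest-handshakes` output on stdin.
  # Succeeds if any peer reports a non-zero handshake timestamp.
  awk '$2 > 0 { found = 1 } END { exit (found ? 0 : 1) }'
}

wait_for_handshake() {
  # Poll interface $1 once per second for up to $2 seconds (default 30).
  local iface="$1" timeout="${2:-30}" i
  for ((i = 0; i < timeout; i++)); do
    if wg show "$iface" latest-handshakes 2>/dev/null | handshake_done; then
      return 0
    fi
    sleep 1
  done
  return 1
}
```

A caller inside the client namespace might run `wait_for_handshake wg0 30 || exit 1` before attempting the ping and DNS checks.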

Architecture

┌─────────────────────────────────────────────────┐
│              GitHub Actions Runner              │
│                                                 │
│  ┌─────────────────┐    ┌──────────────────┐    │
│  │ Main Namespace  │    │ Client Namespace │    │
│  │   (VPN Server)  │────│   (VPN Client)   │    │
│  │                 │veth│                  │    │
│  │  wg0: 10.49.0.1 │    │  wg0: 10.49.0.x  │    │
│  │  dns: 172.16.0.1│    │                  │    │
│  └─────────────────┘    └──────────────────┘    │
└─────────────────────────────────────────────────┘
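
The topology in the diagram can be sketched with plain `ip` commands. This is an illustrative outline, not the PR's script: the namespace name `vpn-client`, the interface names `veth-host` and `veth-client`, and the `10.99.0.0/24` link subnet are all assumptions. The `run` wrapper only prints each command unless `DRY_RUN=0` is set, because the real commands need root.

```shell
#!/usr/bin/env bash
# Sketch only: every name and address below is a hypothetical placeholder.
NS="${NS:-vpn-client}"
VETH_HOST="${VETH_HOST:-veth-host}"
VETH_NS="${VETH_NS:-veth-client}"
HOST_IP="${HOST_IP:-10.99.0.1}"
NS_IP="${NS_IP:-10.99.0.2}"

run() {
  # Print the command by default; execute it only when DRY_RUN=0 (needs root).
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

setup_namespace() {
  # Create the client namespace and a veth pair linking it to the host.
  run ip netns add "$NS"
  run ip link add "$VETH_HOST" type veth peer name "$VETH_NS"
  run ip link set "$VETH_NS" netns "$NS"
  # Address and bring up the host end.
  run ip addr add "$HOST_IP/24" dev "$VETH_HOST"
  run ip link set "$VETH_HOST" up
  # Address and bring up the client end, plus loopback inside the namespace.
  run ip netns exec "$NS" ip addr add "$NS_IP/24" dev "$VETH_NS"
  run ip netns exec "$NS" ip link set "$VETH_NS" up
  run ip netns exec "$NS" ip link set lo up
}

setup_namespace
```

With this layout the client would reach the server's WireGuard port at the host end of the veth pair (`10.99.0.1:51820` under these assumed addresses).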

Test plan

  • CI passes on all matrix combinations (wireguard, ipsec, both)
  • Script passes shellcheck
  • Cleanup works correctly (no orphaned namespaces)
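
For the cleanup item, one common pattern is an `EXIT` trap so teardown runs whether the test passes or fails. The sketch below assumes the hypothetical names `vpn-client` and `veth-host`; it prints commands unless `DRY_RUN=0` is set. The `orphaned_namespaces` helper is also hypothetical and simply checks `ip netns list` output for a leftover namespace.

```shell
#!/usr/bin/env bash
# Sketch only: names are hypothetical placeholders.
NS="${NS:-vpn-client}"
VETH_HOST="${VETH_HOST:-veth-host}"

run() {
  # Print the command by default; execute it only when DRY_RUN=0 (needs root).
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

cleanup() {
  # Each step tolerates failure so a half-built environment is still torn down.
  # Deleting the namespace destroys the veth end inside it, which also removes
  # the host-side peer, so the second command may find nothing to delete.
  run ip netns del "$NS" || true
  run ip link del "$VETH_HOST" || true
}

# Run cleanup on any exit path, including failed assertions.
trap cleanup EXIT

orphaned_namespaces() {
  # Reads `ip netns list` output on stdin; succeeds if $NS is still listed.
  awk -v ns="$NS" '$1 == ns { found = 1 } END { exit (found ? 0 : 1) }'
}
```

After a run, `ip netns list | orphaned_namespaces` returning non-zero would indicate that nothing was left behind.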

🤖 Generated with Claude Code

@dguido dguido requested a review from jackivanov as a code owner November 29, 2025 05:43
@dguido dguido force-pushed the e2e-vpn-connectivity-tests branch from 06ffa12 to 13374b8 Compare November 29, 2025 05:50
@dguido dguido force-pushed the e2e-vpn-connectivity-tests branch from 13374b8 to baab125 Compare November 29, 2025 05:54
Addresses #14912

Current integration tests verify that VPN services start, but don't verify
they actually work. This adds true E2E tests using Linux network namespaces
to simulate a client connecting to the server.

New tests verify:
- WireGuard handshake completes and tunnel is functional
- IPsec/StrongSwan service is configured and listening
- DNS resolution works through VPN (172.16.0.1)
- mobileconfig XML files are valid
- CA certificate chain is correct

Changes:
- Add tests/e2e/test-vpn-connectivity.sh - main E2E test script
- Add tests/e2e/README.md - documentation for running tests
- Update integration-tests.yml to run E2E tests after deployment
- Delete tests/legacy-lxd/ - replaced by new E2E tests
- Update .ansible-lint to remove legacy-lxd from excludes
- Rewrite tests/README.md for clarity

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@dguido dguido force-pushed the e2e-vpn-connectivity-tests branch from baab125 to e9acb12 Compare November 29, 2025 05:56

The namespace test was timing out because the firewall was blocking
UDP traffic on the veth interface. This adds explicit INPUT rules
to allow WireGuard (51820) and IPsec (500, 4500) traffic.

Also refines the MASQUERADE rule to not apply to bridge-local traffic.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

The firewall rules were being appended (-A) after existing DROP rules
and never matched. Changed to -I to insert at beginning of chain.

Also added debug output to show:
- Server WireGuard peers before client connects
- Server port listening status
- iptables INPUT chain on timeout (to verify rules)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
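
The difference between appending and inserting described in the commit message above can be sketched as follows. The ports (51820, 500, 4500) come from the earlier commit message; the interface name `veth-host`, the `10.99.0.0/24` subnet, and the exact shape of the MASQUERADE exclusion are assumptions made for illustration. Commands are printed rather than executed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch only: interface name and subnet are hypothetical placeholders.
VETH_HOST="${VETH_HOST:-veth-host}"
LINK_NET="${LINK_NET:-10.99.0.0/24}"

run() {
  # Print the command by default; execute it only when DRY_RUN=0 (needs root).
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

allow_vpn_ports() {
  local port
  for port in 51820 500 4500; do
    # -I INPUT 1 places the rule at the head of the chain, ahead of any DROP.
    # With -A the rule would land after the DROP and never be evaluated.
    run iptables -I INPUT 1 -i "$VETH_HOST" -p udp --dport "$port" -j ACCEPT
  done
}

masquerade_except_local() {
  # Assumed form of the refinement: NAT traffic leaving the link subnet,
  # but leave traffic that stays within that subnet untouched.
  run iptables -t nat -A POSTROUTING -s "$LINK_NET" ! -d "$LINK_NET" -j MASQUERADE
}

allow_vpn_ports
masquerade_except_local
```

Listing the chain with `iptables -L INPUT -n --line-numbers` is one way to confirm that the ACCEPT rules sit above the DROP rules.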

The async role execution in server.yml causes handlers not to fire
properly. This workaround restarts WireGuard if no peers are found,
ensuring the peer configuration is loaded.

Root cause: import_role with async: 300, poll: 0 breaks handler
notification flow. The 'restart wireguard' handler is notified but
never executed because the async context loses track of handlers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
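
A rough sketch of the workaround described above: check whether the server interface has any peers loaded and restart the service only if it has none. The interface name `wg0` matches the diagram; the systemd unit name `wg-quick@wg0` and the helper names are assumptions. The restart is printed rather than executed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch only: unit name and helper names are assumptions.
IFACE="${IFACE:-wg0}"

run() {
  # Print the command by default; execute it only when DRY_RUN=0 (needs root).
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

has_peers() {
  # Reads `wg show <iface> peers` output on stdin (one public key per line).
  # Succeeds if at least one non-blank line is present.
  grep -q '[^[:space:]]'
}

ensure_peers_loaded() {
  if wg show "$IFACE" peers 2>/dev/null | has_peers; then
    echo "peers already loaded on $IFACE"
  else
    # No peers means the notified handler never ran; reload the config.
    run systemctl restart "wg-quick@$IFACE"
  fi
}
```

Calling `ensure_peers_loaded` before bringing up the client keeps the workaround a no-op on deployments where the handler fired normally.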

…hake

- Disable reverse path filtering on veth interface (can drop packets)
- Add tcpdump capture to see if UDP packets are arriving
- Show host and namespace routing tables
- Add route debugging to error output

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
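
The debugging steps listed in the commit message above might look roughly like this. The names `veth-host` and `vpn-client`, the packet count, and the 10 second capture limit are assumptions; commands are printed rather than executed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch only: names and limits are hypothetical placeholders.
NS="${NS:-vpn-client}"
VETH_HOST="${VETH_HOST:-veth-host}"

run() {
  # Print the command by default; execute it only when DRY_RUN=0 (needs root).
  if [ "${DRY_RUN:-1}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

debug_handshake_path() {
  # Reverse path filtering can silently drop packets arriving on the veth.
  # The kernel uses the higher of the "all" and per-interface values, so
  # net.ipv4.conf.all.rp_filter may need lowering as well.
  run sysctl -w "net.ipv4.conf.$VETH_HOST.rp_filter=0"
  # Capture up to 5 WireGuard packets on the veth, giving up after 10 seconds.
  run timeout 10 tcpdump -ni "$VETH_HOST" -c 5 udp port 51820
  # Show routing tables on the host and inside the client namespace.
  run ip route show
  run ip netns exec "$NS" ip route show
}

debug_handshake_path
```

If the capture shows packets arriving but no handshake follows, the firewall or WireGuard peer configuration is the more likely culprit than routing.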

dguido and others added 2 commits November 29, 2025 01:27
WireGuard only initiates a handshake when there's outgoing traffic or
a keepalive timer fires. Without PersistentKeepalive, the test was
waiting forever because no traffic was being sent through the tunnel
(Table=off prevents route creation).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
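
To make the fix above concrete, here is a minimal sketch of a client config generator with both settings the commit message mentions. `Table = off` and `PersistentKeepalive` are standard wg-quick and WireGuard options; the 25 second interval, the `AllowedIPs` list, and the function name `render_client_config` are assumptions, and the keys are placeholders supplied by the caller.

```shell
#!/usr/bin/env bash
# Sketch only: keepalive interval and AllowedIPs are assumed values.

render_client_config() {
  # Args: client private key, server public key, client address, endpoint.
  local privkey="$1" server_pub="$2" address="$3" endpoint="$4"
  cat <<EOF
[Interface]
PrivateKey = $privkey
Address = $address
# Table = off stops wg-quick from adding routes, so nothing sends traffic
# through the tunnel on its own.
Table = off

[Peer]
PublicKey = $server_pub
Endpoint = $endpoint
AllowedIPs = 10.49.0.1/32, 172.16.0.1/32
# The keepalive timer generates traffic, which triggers the handshake.
PersistentKeepalive = 25
EOF
}
```

For example, `render_client_config "$CLIENT_KEY" "$SERVER_PUB" 10.49.0.2/32 10.99.0.1:51820 > wg0.conf` would write a config in which the keepalive starts the handshake without any manual traffic.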
Remove routing table and rp_filter debug output that was printed on every
run. Keep the packet capture and detailed error diagnostics that are only
shown on failure.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Add explicit documentation about the hardcoded IP addresses and test user
requirements as suggested in code review. This helps users understand what
default values are expected and why tests might fail on custom configurations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

We use uv for dependency management, not pip, so the pip cache setting
was causing warnings about missing cache folders.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

@dguido dguido merged commit 9268a78 into master Nov 29, 2025
23 checks passed
@dguido dguido deleted the e2e-vpn-connectivity-tests branch November 29, 2025 07:16
Development

Successfully merging this pull request may close these issues.

Add true end-to-end VPN connectivity tests using network namespaces

2 participants