feat(app): automated Maestro E2E functional tests (#3857) #4733
Iansabia wants to merge 26 commits into BasedHardware:main
Configures Maestro for automated functional testing with core flows and device-required flow separation via tags.
Tests app launch, sign-in, name entry, language selection, permissions, speech profile skip, and home screen landing.
Tests conversation list rendering, scrolling, and list item visibility.
Tests opening a conversation, viewing transcript, and detail screen elements.
Tests creating, updating, and deleting conversations through the UI.
Tests memory list display, creation, and interaction.
Tests chat input, message sending, and AI response display.
Tests app store browsing, plugin installation, and management.
Tests settings screen navigation, preference toggles, and profile access.
Tests BLE device scanning, pairing, and connection status. Requires physical Omi device (tagged device_required).
Tests recording start, transcription indicator, and conversation creation from captured audio. Requires physical Omi device.
Runs all core E2E flows sequentially with pass/fail summary and optional HTML report generation.
Runs Maestro flows that require a physical Omi device connected.
Adds --e2e flag to run Maestro functional tests alongside existing unit/widget tests.
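The pass/fail summary described for the core runner could be tallied roughly as follows. This is a minimal sketch, not the PR's actual script: the function name `summarize_results` is hypothetical, and it assumes the exit status of each flow has already been collected by the caller.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the PR): tally per-flow exit codes.
# Each argument is the exit status of one flow; 0 means the flow passed.
summarize_results() {
  local passed=0 failed=0 status
  for status in "$@"; do
    if [ "$status" -eq 0 ]; then
      passed=$((passed + 1))
    else
      failed=$((failed + 1))
    fi
  done
  echo "passed=$passed failed=$failed"
  # Return non-zero when any flow failed so a calling script or CI job can react.
  [ "$failed" -eq 0 ]
}
```

Returning a non-zero status on any failure lets a wrapper script stop or report without parsing the printed summary.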
Code Review
This pull request introduces a comprehensive suite of Maestro E2E tests, which is a great addition for ensuring app quality. The tests cover core functionality and are well-structured with tags for simulator vs. device-specific flows. My review focuses on improving the maintainability and robustness of the new test scripts. I've identified a few areas with code duplication in both the YAML flow definitions and the shell runner scripts. Addressing these will make the test suite more resilient and easier to manage as the app evolves.
…g flow Merges duplicate Maybe Later/later runFlow blocks into a single regex-based match per code review feedback.
…_all.sh Uses --exclude-tags=device_required to run all core flows dynamically, keeping config.yaml as the single source of truth for flow categorization.
…_device.sh Uses --include-tags=device_required to run device flows dynamically, keeping config.yaml as the single source of truth.
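The tag-based selection in these two commits can be illustrated with a small helper that maps a run mode to the corresponding Maestro tag filter. This is a sketch under assumptions: `maestro_tag_args` and the mode names `core` and `device` are hypothetical, while `--exclude-tags=device_required` and `--include-tags=device_required` are the filters named in the commit messages.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the PR): choose the tag filter for a run mode.
maestro_tag_args() {
  case "$1" in
    core)   echo "--exclude-tags=device_required" ;;  # simulator-only flows
    device) echo "--include-tags=device_required" ;;  # flows needing a physical Omi
    *)      echo "unknown mode: $1" >&2; return 1 ;;
  esac
}
```

A runner could then call something like `maestro test "$(maestro_tag_args core)" app/.maestro`, assuming Maestro is installed and the flows live under `app/.maestro` (the directory is inferred from the script paths in this PR, not confirmed).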
Hey @Iansabia, closing this for now, and thanks for putting it together. The code and write-up look solid, but there's no real evidence that any of this was actually run and tested end-to-end. The test plan checkboxes are unchecked, and there are no screenshots, terminal output, videos, or logs showing it working on a real device or environment. This matters more than ever now that AI makes writing code easy: the code itself isn't the hard part anymore. What's valuable is proving it actually works: real test output, real screenshots, real demo. That's what gives reviewers confidence to merge. Feel free to reopen once you have real end-to-end evidence: run the tests, paste the output, show it working. We'd love to merge it then.
Hey @Iansabia 👋 Thank you so much for taking the time to contribute to Omi! We truly appreciate you putting in the effort to submit this pull request. After careful review, we've decided not to merge this particular PR. Please don't take this personally — we genuinely try to merge as many contributions as possible, but sometimes we have to make tough calls based on:
Your contribution is still valuable to us, and we'd love to see you contribute again in the future! If you'd like feedback on how to improve this PR or want to discuss alternative approaches, please don't hesitate to reach out. Thank you for being part of the Omi community! 💜
Summary
- Flows are tagged `core` (runs on simulator) vs `device_required` (needs physical Omi hardware)
- Runner scripts (`run_all.sh`, `run_device.sh`) with pass/fail reporting and optional HTML output
- Integrated into `test.sh` via the `--e2e` flag

How It Works
1. Install Maestro: `brew install maestro`
2. Run the core flows on a simulator: `bash app/.maestro/scripts/run_all.sh`
3. Run the device flows with an Omi connected: `bash app/.maestro/scripts/run_device.sh`
4. Or use `bash app/test.sh --e2e` to run unit + widget + E2E tests together

After ~1 hour you get a full report covering sign-in, conversation recording/transcription, CRUD operations, chat, and app management.
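How an opt-in `--e2e` flag might be detected in a script like `test.sh` can be sketched as below. This is an illustration only, not the PR's implementation: the function name `parse_e2e_flag` is hypothetical, and the real script may parse its arguments differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the PR): report whether --e2e was passed.
# Prints 1 if the flag is present among the arguments, otherwise 0.
parse_e2e_flag() {
  local run_e2e=0 arg
  for arg in "$@"; do
    case "$arg" in
      --e2e) run_e2e=1 ;;
    esac
  done
  echo "$run_e2e"
}
```

A script could call `parse_e2e_flag "$@"` and start the Maestro runner only when the result is `1`, so the existing unit and widget tests keep their default behavior.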
Test Flows
Test plan
- [ ] `flutter build ios --flavor dev --simulator`
- [ ] `bash app/.maestro/scripts/run_all.sh` on simulator
- [ ] `bash app/.maestro/scripts/run_device.sh` with Omi connected

Closes #3857