Commit graph

6 commits

Author SHA1 Message Date
f2ad220cc8 refactor(audio): move full pipeline into RobotEar.get_text(); add config constants
config.py:
- Add AUDIO_SILENCE_THRESHOLD, AUDIO_SILENCE_MARGIN, AUDIO_MIN_DURATION,
  AUDIO_MAX_DURATION so all audio tunables live in one place

whisper_main.py:
- RobotEar.get_text() now owns the full pipeline: silence trimming,
  duration guards, WAV write, Whisper transcription with all options
- _fix_recognition() moved here from RobotApp (ASR post-processing
  belongs in the ear layer, not the application layer)
- Add `import re`, `import config`; remove unused `sounddevice` import

voice_main.py:
- Remove `import scipy.io.wavfile` (WAV handling moved to whisper_main)
- get_audio_text() is now a one-liner: return self.ear.get_text(self.audio_frames)
- Remove _fix_recognition() (lives in RobotEar now)

Closes #9
2026-02-20 21:45:16 +08:00
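
The trimming and duration-guard steps named in this commit might look roughly like the sketch below. It is not the code from whisper_main.py: the four constant names come from the commit message, but their values and units, and the helper names trim_silence() and prepare_audio(), are assumptions made for illustration. The WAV write and the Whisper call are left out.

```python
# Hypothetical values: the commit names these constants but not their values or units.
AUDIO_SILENCE_THRESHOLD = 0.01  # assumed: absolute amplitude below which a sample is silence
AUDIO_SILENCE_MARGIN = 0.2      # assumed: seconds of context kept around the speech
AUDIO_MIN_DURATION = 0.3        # assumed: seconds; shorter clips are skipped
AUDIO_MAX_DURATION = 15.0       # assumed: seconds; longer clips are truncated


def trim_silence(samples, sample_rate):
    """Cut leading/trailing silence, keeping a margin around the loud region."""
    loud = [i for i, s in enumerate(samples) if abs(s) >= AUDIO_SILENCE_THRESHOLD]
    if not loud:
        return []  # nothing above the threshold
    margin = int(AUDIO_SILENCE_MARGIN * sample_rate)
    start = max(0, loud[0] - margin)
    end = min(len(samples), loud[-1] + 1 + margin)
    return samples[start:end]


def prepare_audio(samples, sample_rate):
    """Trim silence, then apply the duration guards. Returns None if the clip is too short."""
    trimmed = trim_silence(samples, sample_rate)
    if len(trimmed) / sample_rate < AUDIO_MIN_DURATION:
        return None
    max_len = int(AUDIO_MAX_DURATION * sample_rate)
    return trimmed[:max_len]
```

Keeping both guards in the ear layer means a caller such as get_audio_text() only has to handle a single "nothing usable" result.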
54c1dc8c01 refactor: extract magic constants to config.py (closes #7)
Create config.py with env-var-overridable hardware constants:
- SERIAL_PORT / SERIAL_BAUD
- LLM_MODEL_PATH / YOLO_MODEL_PATH
- Z_HOVER / Z_GRAB / Z_AFTER_PICK  (Z_TABLE eliminated — same as Z_GRAB)
- REST_X / REST_Y / REST_Z         (replaces three copies of [120,0,60])
- WS_X / WS_Y / WS_Z              (workspace clipping bounds)
- DAMPING_BUFFER_SIZE / DAMPING_MAX_SPEED / DAMPING_FACTOR
- OFFSET_Y / OFFSET_Z

arm_main.py: port/baud/damping defaults now read from config.
voice_main.py: all magic constants replaced; remaining Chinese
non-i18n strings translated to English.
.gitignore: whitelist config.py.
2026-02-20 20:56:52 +08:00
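
A minimal sketch of the env-var override pattern this commit describes, assuming each environment variable has the same name as its constant. The serial defaults are placeholders, since the commit does not state them; the rest position [120, 0, 60] is taken from the message. load_config() is a hypothetical helper, used here so the lookup can be exercised with a plain dict; the real config.py may simply define module-level names.

```python
import os


def load_config(env=None):
    """Read hardware constants, letting environment variables override the defaults."""
    env = os.environ if env is None else env
    return {
        # Placeholder defaults: the commit does not give the real port or baud rate.
        "SERIAL_PORT": env.get("SERIAL_PORT", "/dev/ttyUSB0"),
        "SERIAL_BAUD": int(env.get("SERIAL_BAUD", "115200")),
        # Rest position from the commit message: replaces copies of [120, 0, 60].
        "REST_X": float(env.get("REST_X", "120")),
        "REST_Y": float(env.get("REST_Y", "0")),
        "REST_Z": float(env.get("REST_Z", "60")),
    }
```

Environment values always arrive as strings, so each numeric constant is converted once at load time rather than at every use site.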
7931d42010 merge: fix hardcoded current_pos + add requirements.txt (closes #3, #4) 2026-02-20 20:45:27 +08:00
9e672ce637 fix(voice_main): pass actual current_pos to execute_pick/execute_lift; add requirements.txt
- Closes #3: execute_pick and execute_lift now accept an optional
  current_pos parameter. Callers (execute_command) pass self.current_pos
  so move_line starts from the actual arm position rather than a hardcoded
  [120, 0, 60] default. Falls back to rest position when None.
- Closes #4: add requirements.txt with correct package names.
  Note: README listed 'openai-whisper' but the code imports faster_whisper;
  requirements.txt uses the correct 'faster-whisper' package name.
- Replace all `servo_buffer = []` assignments with `.clear()` to stay
  compatible with the deque introduced in PR #8 (#6 follow-up).
- Partial #5: translate Chinese print/comment strings to English in
  execute_pick, execute_lift, and execute_command.
2026-02-20 20:26:30 +08:00
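
The two behavioural changes in this commit can be sketched as follows. The class ArmSketch, the height argument, the return value, and the deque maxlen are all hypothetical; only the optional current_pos parameter, the fallback to the rest position, and the use of `.clear()` come from the commit message.

```python
from collections import deque

REST_POSITION = [120, 0, 60]  # rest position named in the commit message


class ArmSketch:
    """Illustrative stand-in for the application class; not the real voice_main.py code."""

    def __init__(self):
        self.current_pos = list(REST_POSITION)
        self.servo_buffer = deque(maxlen=5)  # maxlen is an assumed value

    def execute_lift(self, height, current_pos=None):
        # Start from the caller's actual position; fall back to rest when None.
        start = list(REST_POSITION) if current_pos is None else list(current_pos)
        target = [start[0], start[1], start[2] + height]
        # Empty the buffer in place so it stays a deque.
        self.servo_buffer.clear()
        self.current_pos = target
        return start, target
```

Assigning `self.servo_buffer = []` instead would swap the deque for a plain list and drop its maxlen, which is why the commit switches to `.clear()`.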
6977061bef fix(whisper): remove broken start_recording; move scipy import to top-level
whisper_main.py:
- Remove RobotEar.start_recording() and record_callback() which called
  the nonexistent sd.start_stream() API (correct API is sd.InputStream).
  These methods were never called by voice_main.py and contained a broken
  sounddevice API call that would raise AttributeError (#2).
- Remove unused recording_buffer field
- Translate Chinese comment/docstring to English (#5)

voice_main.py:
- Move `import scipy.io.wavfile as wav` from inside get_audio_text()
  function body to module top-level where all imports belong (#4 related)
- Sort imports: stdlib before third-party, local last
- Remove Chinese comment, replace with English equivalent
2026-02-20 20:24:24 +08:00
whisper11111111111
bb85c3266b Initial commit for robot arm voice control system 2026-02-10 23:31:14 +08:00