Self Print
Project Concept
Memory Keeper: An AI Photo-Journalist for Your Life Stories
Memory Keeper is an AI photo-journalist that turns a single photo into a natural voice-to-voice conversation and, ultimately, into a beautiful, printable memory card.
Its purpose is simple: help people, especially families and elders, preserve the stories behind their photos without needing to write or use complicated tools.
How the AI Photo-Journalist Works
When someone uploads a photo, the AI photo-journalist:
1. Looks at the photo like a real reporter
- The image is stored in Vercel Blob
- OpenAI GPT-4o Vision grounds the agent in the scene — the people, objects, setting, and emotional tone.
2. Begins a live, back-and-forth conversation
- The conversation runs on the OpenAI Realtime API (gpt-4o-realtime)
- The agent speaks with a warm, curious voice inspired by Brandon Stanton, the creator of Humans of New York
- It asks one gentle follow-up question at a time, just like a real interviewer.
3. Helps the user tell the story behind the moment
It listens, interprets, and probes deeper conversationally:
- Who’s in the photo?
- What was happening?
- Why does this moment matter?
- How did it feel?
4. Turns the conversation into a keepsake
- The transcript becomes a poetic title and a short narrative
- The result is rendered as a Polaroid-style memory card you can print or share.
The entire experience is driven by the AI agent — the web layer simply supports it.
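As a rough illustration of steps 1 and 2, the sketch below builds the two payloads that connect Vision to the voice loop: a Chat Completions request that asks GPT-4o to describe the photo, and a Realtime `session.update` event that passes that description to the interviewer. The names `SCENE_PROMPT`, `buildVisionRequest`, and `buildSessionUpdate`, along with the prompt wording, are assumptions made for this example and are not taken from the project's code.

```typescript
// Sketch only: every name and prompt string here is hypothetical.
const SCENE_PROMPT =
  "Describe the people, objects, setting, and emotional tone of this photo.";

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface VisionRequest {
  model: string;
  messages: Array<{ role: "user"; content: ContentPart[] }>;
}

// Build a Chat Completions request body that sends the stored photo URL
// (for example, its Vercel Blob URL) to GPT-4o together with a text prompt.
function buildVisionRequest(photoUrl: string): VisionRequest {
  return {
    model: "gpt-4o",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: SCENE_PROMPT },
          { type: "image_url", image_url: { url: photoUrl } },
        ],
      },
    ],
  };
}

// Fold the scene description into the interviewer instructions that are sent
// to the realtime session before the conversation starts.
function buildSessionUpdate(sceneDescription: string) {
  return {
    type: "session.update" as const,
    session: {
      instructions: [
        "You are a warm, curious photo-journalist interviewing someone about a photo.",
        "Ask one gentle follow-up question at a time.",
        "What you can see in the photo: " + sceneDescription,
      ].join("\n"),
    },
  };
}
```

Keeping both builders as pure functions means the prompt wording can be tuned and unit-tested without calling either API.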
Project Goals
Emotional Goal
Capture stories people intend to write one day but never do — especially those from older relatives or meaningful family moments.
Product Goal
Deliver an end-to-end prototype that feels magical and real: smooth voice interaction, minimal friction, and a delightful final keepsake.
Agent Goal
Demonstrate a true agentic workflow:
- Vision-guided context
- Realtime conversational loop
- Autonomy in deciding when a memory is ready to be turned into a finished card
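One possible way to make "memory readiness" concrete is a coverage check over the interview questions (who, what, why, how it felt). The sketch below is a simple heuristic written for illustration, not the agent's actual decision logic. The `Facet` and `Turn` types, the thresholds, and the assumption that each agent question is tagged with the facet it asks about are all hypothetical.

```typescript
// Sketch only: types, thresholds, and facet tagging are assumptions.
type Facet = "who" | "what" | "why" | "feeling";
const ALL_FACETS: Facet[] = ["who", "what", "why", "feeling"];

interface Turn {
  role: "agent" | "user";
  text: string;
  facet?: Facet; // on agent turns: which question the agent just asked
}

// Count whitespace-separated words; an empty string counts as zero.
function wordCount(text: string): number {
  return text.trim().split(/\s+/).filter((w) => w.length > 0).length;
}

// A facet counts as covered when the user gives a non-trivial answer
// directly after the agent asked about it.
function coveredFacets(turns: Turn[], minAnswerWords = 3): Facet[] {
  const covered: Facet[] = [];
  let pending: Facet | undefined;
  for (const turn of turns) {
    if (turn.role === "agent") {
      pending = turn.facet;
    } else if (pending !== undefined && wordCount(turn.text) >= minAnswerWords) {
      if (covered.indexOf(pending) === -1) covered.push(pending);
      pending = undefined;
    }
  }
  return covered;
}

// The memory is "ready" once every facet is covered and the user has said
// enough overall to write a narrative from.
function isMemoryReady(turns: Turn[], minUserWords = 20): boolean {
  const covered = coveredFacets(turns);
  const userWords = turns
    .filter((t) => t.role === "user")
    .reduce((sum, t) => sum + wordCount(t.text), 0);
  return ALL_FACETS.every((f) => covered.indexOf(f) !== -1) && userWords >= minUserWords;
}
```

A rule like this could also drive the visual "memory readiness" cue, for example by showing how many of the four facets are covered so far.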
What We’re Improving Next
Agent Intelligence
- Refining the photo-journalist persona (tone, depth, pacing)
- Smarter follow-up questions using Vision + conversation history
- Faster turn-taking and lower latency in the realtime loop
End-to-End Flow
- Upload → Vercel Blob → Vision → realtime voice → narrative generation
- Transcript cleanup for consistent, poetic memory-card output
- Serving images via /api/photo/[photoId]
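For the photo endpoint, the core decision is small: validate the ID, look up where the image lives, and report whether it was found. The sketch below keeps that logic framework-free so a route handler for /api/photo/[photoId] could call it and then stream or redirect to the blob. The ID format, the `blobIndex` lookup table, and the name `resolvePhoto` are assumptions made for this example.

```typescript
// Sketch only: the ID pattern and the lookup table are hypothetical.
type PhotoResolution =
  | { kind: "invalid" }
  | { kind: "not_found" }
  | { kind: "found"; blobUrl: string };

// Assumed ID format: 8 to 64 URL-safe characters. Rejecting anything else
// keeps path-like input such as "../x" away from the lookup.
const PHOTO_ID_PATTERN = /^[A-Za-z0-9_-]{8,64}$/;

function resolvePhoto(
  photoId: string,
  blobIndex: Record<string, string>, // photoId -> stored blob URL
): PhotoResolution {
  if (!PHOTO_ID_PATTERN.test(photoId)) {
    return { kind: "invalid" };
  }
  // Use an own-property check so inherited keys like "toString" never match.
  const known = Object.prototype.hasOwnProperty.call(blobIndex, photoId);
  const blobUrl = known ? blobIndex[photoId] : undefined;
  if (blobUrl === undefined) {
    return { kind: "not_found" };
  }
  return { kind: "found", blobUrl };
}
```

A handler would typically map "invalid" to a 400 response, "not_found" to a 404, and "found" to the image itself.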
Experience Design
- Scrapbook-inspired home screen
- Polaroid-camera metaphor for generating keepsakes
- Print-ready card layout (typography, framing, vintage touch)
- Smooth feedback: progress bar, speaking/listening indicators
Stability & Reliability
- Robust WebSocket reconnection
- Large image handling
- Mic permission issues across browsers
- Graceful fallback when Vision encounters errors
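For WebSocket reconnection, a common approach is exponential backoff with jitter: wait longer after each failed attempt, cap the wait, and randomize it so many clients do not reconnect at the same instant. The sketch below only computes the delays. The function names and the default values (500 ms base, 15 s cap) are assumptions, not settings taken from the project.

```typescript
// Sketch only: names and default timings are hypothetical.

// Delay before reconnect attempt number `attempt` (0-based).
// The ceiling doubles with each attempt up to maxMs; half of it is fixed and
// half is random ("equal jitter"), so the wait is never zero.
function backoffDelayMs(
  attempt: number,
  baseMs = 500,
  maxMs = 15000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  return Math.round(ceiling / 2 + random() * (ceiling / 2));
}

// Full list of waits for a bounded number of attempts, useful for logging
// or for showing the user how long reconnection may take.
function reconnectSchedule(
  maxAttempts: number,
  random: () => number = Math.random,
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    delays.push(backoffDelayMs(attempt, 500, 15000, random));
  }
  return delays;
}
```

On a socket close or error, the client would wait backoffDelayMs(attempt) milliseconds before opening a new connection, and reset the attempt counter to zero once a connection succeeds.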
How Others Can Contribute
Conversation & UX
- Improve persona design and question flow
- Create flows for grandparents, couples, travel memories
- Better visual cues for “memory readiness”
Engineering
- Audio encoding + buffering optimization
- Optional storage backends (S3, Cloudinary, Supabase)
- Export pipelines: email, archive, batch printing
AI & Prompting
- Explore additional “modes” (kids, lighthearted, serious, multilingual)
- Improve story-generation consistency and emotional specificity
- Add multi-language support end-to-end
Design & Branding
- Logo, color system, typography refinements
- Additional card templates (wedding, travel, family, archival)
- Accessibility across interactions
Research
- Field tests with families and elders
- Trust-building and safety insights
- Privacy, consent, and long-term archival patterns
Tech Stack
- Next.js + TypeScript + Tailwind + shadcn/ui
- OpenAI Realtime API
- GPT-4o + GPT-4o Vision
- Vercel Blob Storage
If you love storytelling, memories, and human-centered AI, the AI photo-journalist has a lot of exciting surfaces to build on — from conversation design to realtime audio to visual memory-making.
Entry
Status: Submitted
Last saved: December 06 at 9:45 AM +08
Team Roster
Yu Gary Wang, Team Lead (RSVP approved)
AI Product Manager at Zoom