Our experiment is over. Thank you for your overwhelming interest & support!

What would it take for an AI to clone someone?

This was a hobby project: I wanted to push generative AI to its limits and see what it would take to clone a person. It is a typical "Ship of Theseus" problem with countless variables, so I started with just two.
  • Memory
  • Articulation and expression style
These two variables are enough to give a semblance of personality. I took inspiration from the quote "People don't have ideas. Ideas have people." – Carl Jung.

Technical Limitations

Current-gen AIs like ChatGPT are trained on billions of data points, but only on data available in the public domain. To give a semblance of one specific person's personality, you would need a lot of that person's own data.

Key Challenges
  • No personal data
  • ChatGPT accessed through code (the API) does not remember context between requests
  • You can only give GPT-3.5 ~3,000 words of context right now to convey someone’s personality. The new GPT-4 will accept up to ~24k words, but it will come with very aggressive per-token pricing
  • Training any model on millions of words can be ultra-expensive
  • ChatGPT's knowledge cutoff (currently it only covers data from before 2022)
  • Performance and efficiency

Building Context Layer

I ended up building my own (very primitive) context engine using Elasticsearch and Google NLP Engine. This layer prepares everything for ChatGPT and feeds it only the data it needs to answer a given question.
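
To make the idea concrete, here is a minimal Python sketch of what such a context layer might do: pick the stored memories most relevant to a question, then pack them into a prompt that stays under a word budget. It is not the engine described above. The word-overlap scoring is only a stand-in for a real Elasticsearch query, the names `retrieve`, `build_prompt` and `WORD_BUDGET` are hypothetical, and no call to ChatGPT is made.

```python
import re

# Assumption: a rough per-request word budget, based on the ~3,000-word
# figure quoted for GPT-3.5 in this post.
WORD_BUDGET = 3000


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def retrieve(question: str, memories: list[str], top_k: int = 5) -> list[str]:
    """Rank memories by word overlap with the question.

    This is a toy stand-in for a search engine query; memories that share
    no words with the question are dropped.
    """
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(m)), m) for m in memories]
    scored = [(score, m) for score, m in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:top_k]]


def build_prompt(question: str, memories: list[str], persona: str,
                 word_budget: int = WORD_BUDGET) -> str:
    """Assemble a prompt from the most relevant memories within the budget."""
    header = f"Answer in the voice of {persona}, using only these memories:"
    footer = f"Question: {question}"
    used = len(header.split()) + len(footer.split())
    kept = []
    for memory in retrieve(question, memories):
        cost = len(memory.split())
        if used + cost > word_budget:
            break  # stop before the prompt exceeds the word budget
        kept.append(f"- {memory}")
        used += cost
    return "\n".join([header, *kept, footer])


if __name__ == "__main__":
    demo_memories = [
        "We will build more hospitals in every district.",
        "Cricket taught me to never give up.",
        "Education reform is the key to our future.",
    ]
    print(build_prompt("What is your plan for hospitals and education?",
                       demo_memories, persona="a test persona"))
```

The resulting string would then be sent to the language model as a single request. Because each request is self-contained, the model never has to remember earlier context on its own.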

Testing the Engine

Let's pick "Imran Khan" as a test subject and use his tweets as the memory (tweets are the easiest way to sample someone's thoughts).

Creativity
Question: Write a Poem for IMF


Formal Communication
Question: List down your plan to improve healthcare and education in Pakistan

The Real Use Cases

This was just a hobby project and I spent less than 3 days on it, but the same engine can be trained on countless things. Here are a few examples:

  • Legal: A persona of a judge or lawyer can be developed, giving you far greater insight into where a particular case might go
  • Healthcare: Doctors' personas can be developed, and the AI can give persona-based second opinions by looking at reports and the history of both patient and doctor
  • Quality of life: You can train it on your WhatsApp conversations, so the next time you are making promises, the algorithm will remind you of the last time you made the same promise and then Istekhara happened
  • Personal assistant: By looking at your emails, it can give you clear insight into what you say versus what you do. It can even become an assistant modeled on you
  • Content personalization: An eCommerce store can personalize the marketing material and products it shows you based on who you are
  • Financial research: Get a financial research assistant with instant access to numbers

Summary

It’s very much possible to mimic someone, and it’s coming sooner rather than later. These are some key stats:
  • Cost of Platforms / Tools: originally under $10; a lot more now
  • Developed Solution: Context Engine (Elasticsearch + NLP Engine)
  • Development time: Less than 3 days
  • Pre-loaded Demo: Link


Hammad Khan

Co-Founder @ AlphaVenture

I am glad you are testing something I built for fun. Have questions or looking to build something similar of your own? Reach out!