Metahumans Halcyox

Building the docs

cd doc
.\make.bat html

Currently being refactored

Priority Refactor items

Restructuring the code flow into a one-way dependency chain (removing the circular dependency between Setting and Scene)
Defining object-oriented code structure
Adding documentation
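
As a rough illustration of the one-way dependency chain, the sketch below has Scene hold a Setting while Setting knows nothing about Scene. Only the names Setting and Scene come from the notes above; the fields and the describe method are hypothetical, and the real classes may look quite different.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Setting:
    """Leaf of the chain: holds no reference to Scene."""
    name: str
    description: str = ""


@dataclass
class Scene:
    """Depends on Setting; the dependency only points one way."""
    setting: Setting
    characters: List[str] = field(default_factory=list)

    def describe(self) -> str:
        # Hypothetical helper, used here only to show Scene reading from Setting.
        cast = ", ".join(self.characters) or "nobody"
        return f"{self.setting.name}: {cast}"


scene = Scene(Setting("bookstore"), ["Tom"])
print(scene.describe())
```

If Setting ever needs information about the scene it belongs to, passing that information in as an argument keeps the chain one-way.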

Non-priority Refactor items

Encapsulation and data privacy (private attributes such as __id)
Object persistence with ZODB
Multithreading
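
One possible shape for the encapsulation item, assuming a read-only identifier is the goal: a double-underscore attribute exposed through a property with no setter. The Character class and its constructor are hypothetical; only the __id attribute name appears in the notes above.

```python
class Character:
    """Hypothetical class; only the `__id` name is taken from the notes."""

    def __init__(self, char_id: int, name: str) -> None:
        self.__id = char_id  # name-mangled by Python to _Character__id
        self.name = name

    @property
    def id(self) -> int:
        """Read-only access; no setter is defined, so assignment fails."""
        return self.__id


tom = Character(1, "Tom")
print(tom.id)
try:
    tom.id = 2
except AttributeError:
    print("id is read-only")
```

Name mangling is a convention rather than real access control: the value is still reachable as `_Character__id`, so this guards against accidents, not against determined callers.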

Next time:

Work on the batch generation of videos for two characters, and see how far you get.
Also, work on the front end with Alice.

Priority TODO:

* Selection backend for voice options

Set up a simple client-server app so people can connect to a server and talk to an AI instance
Screen record video
Avatar creation/importing
Record + stream blendshapes

* Convert input script batch pipeline to work with our video generation scripts

Make VidGen Server -> rendering server distributed across multiple computers
Methods to upload videos to YouTube -> Google API integration

* Multi-character (n-entity) scene methods, just give prim paths

Talk to multiple AIs simultaneously

Non-Priority TODO:

Emotional speech
Optimization
Change Azure to streaming for lower latency
Connect your own camera to the output video with the AI
Unreal integration
SSML for voice style modification
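
For the emotional speech and SSML items, a minimal sketch of building a styled SSML document follows. It assumes Azure's mstts:express-as element is the mechanism to use; build_ssml and its parameters are hypothetical names, and which styles a given voice supports would need to be checked against the Azure voice list.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice="en-US-JennyNeural", style=None, degree=None):
    """Return an SSML string; wrap the text in mstts:express-as if a style is given."""
    body = escape(text)  # escape &, < and > so the document stays well-formed
    if style is not None:
        attrs = f"style={quoteattr(style)}"
        if degree is not None:
            attrs += f" styledegree={quoteattr(str(degree))}"
        body = f"<mstts:express-as {attrs}>{body}</mstts:express-as>"
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">'
        f"<voice name={quoteattr(voice)}>{body}</voice></speak>"
    )


print(build_ssml("This is a test", style="angry", degree=2))
```

Building the string in one place keeps quoting and escaping out of the shell, which is where hand-written SSML tends to break.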

Web TODO:

Figure out dependencies, Docker stuff, versions
How to put on Amplify?
Should we use EC2 instances or what?
Website should have registration and billing
Token system for AI voice credits

Marketing TODO:

Affiliate marketing application to get other people to grow our software

LOCAL

mic input
speech to text
generate response
get .wav
run sh script
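
The five LOCAL steps above can be read as one conversational turn. The sketch below wires them together with injected callables so each stage can be swapped or stubbed; run_local_turn and all of its parameter names are hypothetical, and the real stages (microphone capture, speech recognition, the shell script) are not shown.

```python
from typing import Callable


def run_local_turn(
    record: Callable[[], bytes],
    transcribe: Callable[[bytes], str],
    respond: Callable[[str], str],
    synthesize: Callable[[str], str],
    animate: Callable[[str], None],
) -> str:
    """Run one turn and return the path of the generated .wav file."""
    audio = record()              # mic input
    prompt = transcribe(audio)    # speech to text
    reply = respond(prompt)       # generate response
    wav_path = synthesize(reply)  # get .wav
    animate(wav_path)             # run sh script
    return wav_path


# Stub stages stand in for the real ones.
calls = []
path = run_local_turn(
    record=lambda: b"\x00\x01",
    transcribe=lambda audio: "hello",
    respond=lambda prompt: f"You said: {prompt}",
    synthesize=lambda reply: "reply.wav",
    animate=calls.append,
)
print(path, calls)
```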

SERVER

Directory Structure

XRAgents
├── LICENSE
├── README.md
├── alphademo
│   ├── videoproxy
│   │   ├── src
│   │   │   ├── main.rs
│   │   │   └── signal.rs
│   │   ├── brws_sess
│   │   ├── Cargo.lock
│   │   └── Cargo.toml
...
├── deps/streaming_server
...
├── xragents
│   ├── anim.py
│   ├── audio.py
│   ├── cast.py
│   ├── nlp.py
│   ├── scenes.py
│   ├── scriptgen.py
│   ├── session.py
│   ├── types.py
│   └── utils.py
└── digital-humans.db

Example POST Requests

Create Character

http://digital-humans.loca.lt/create-character
characterName: Tom
characterDescription: Tom is a wise old man who genuinely cares about people.

Create Session

http://digital-humans.loca.lt/create-session
sessionName: newSession
sessionDescription: The following is a conversation between you and Tom in the bookstore.
characterIDList: 1

Get Response

http://digital-humans.loca.lt/get-response
promptText: What should I do to heal from heartbreak?
sessionID: 1
characterID: 1
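
A small sketch of issuing these requests from Python, using only the standard library. The endpoint names and field names are taken from the examples above; sending the fields as a form-encoded body is an assumption (the server may expect JSON or multipart data instead), and build_post is a hypothetical helper.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://digital-humans.loca.lt"


def build_post(endpoint: str, fields: dict) -> Request:
    """Build (but do not send) a form-encoded POST request for one endpoint."""
    data = urlencode(fields).encode("utf-8")  # assumption: form-encoded body
    return Request(f"{BASE_URL}/{endpoint}", data=data, method="POST")


req = build_post("get-response", {
    "promptText": "What should I do to heal from heartbreak?",
    "sessionID": 1,
    "characterID": 1,
})
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

The same helper covers create-character and create-session by changing the endpoint and the field dictionary.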

Debug (ignore)

curl --location --request POST "https://eastus.tts.speech.microsoft.com/cognitiveservices/v1" --header "Ocp-Apim-Subscription-Key: YourSubscriptionKey" --header "Content-Type: application/ssml+xml" --header "X-Microsoft-OutputFormat: audio-16khz-128kbitrate-mono-mp3" --header "User-Agent: curl" --data-raw "<speak version='1.0' xml:lang='en-US'><voice xml:lang='en-US' xml:styledegree='2' name='en-US-JennyNeural'>I hate you! You ruined my life!</voice></speak>" > C:\Users\phn431\Desktop\digital-humans-backend\wav\test.mp3

curl --location --request POST "https://eastus.api.cognitive.microsoft.com/sts/v1.0/issuetoken" --header "Ocp-Apim-Subscription-Key: YourSubscriptionKey" --header 'Content-Type: application/ssml+xml' --header 'X-Microsoft-OutputFormat: audio-16khz-128kbitrate-mono-mp3' --header 'User-Agent: curl' --data-raw '<speak version='\''1.0'\'' xml:lang='\''en-US'\''><voice xml:lang='\''en-US'\'' xml:style='\''angry'\'' xml:styledegree='\''2'\'' name='\''en-US-JennyNeural'\''>This is a test</voice></speak>' > C:\Users\phn431\Desktop\digital-humans-backend\wav\test.wav

curl --location --request POST "https://eastus.tts.speech.microsoft.com/cognitiveservices/v1" --header "Ocp-Apim-Subscription-Key: YourSubscriptionKey" --header 'Content-Type: application/ssml+xml' --header 'X-Microsoft-OutputFormat: audio-16khz-128kbitrate-mono-mp3' --header 'User-Agent: curl' --data-raw '<speak version='\''1.0'\'' xml:lang='\''en-US'\''><voice xml:lang='\''en-US'\'' xml:gender='\''Female'\'' name='\''en-US-JennyNeural'\''>my voice is my passport verify me</voice></speak>' > C:\Users\phn431\Desktop\digital-humans-backend\wav\test.mp3

curl --location --request POST "https://eastus.tts.speech.microsoft.com/cognitiveservices/v1" --header "Ocp-Apim-Subscription-Key: YourSubscriptionKey" --header "Content-Type: application/ssml+xml" --header "X-Microsoft-OutputFormat: audio-16khz-128kbitrate-mono-mp3" --header "User-Agent: curl" --data-raw "<speak version='1.0' xml:lang='en-US'><voice xml:lang='en-US' xml:gender='Female' name='en-US-JennyNeural'>my voice is my passport verify me</voice></speak>" > output.mp3

key="YourSubscriptionKey"
region="YourServiceRegion"

curl "https://${region}.tts.speech.microsoft.com/cognitiveservices/v1" \
--header "Ocp-Apim-Subscription-Key: ${key}" \
--header 'Content-Type: application/ssml+xml' \
--header 'X-Microsoft-OutputFormat: audio-16khz-128kbitrate-mono-mp3' \
--header 'User-Agent: curl' \
--data-raw '<speak version='\''1.0'\'' xml:lang='\''en-US'\''>
    <voice xml:lang='\''en-US'\'' xml:gender='\''Female'\'' name='\''en-US-JennyNeural'\''>
        my voice is my passport verify me
    </voice>
</speak>' > output.mp3
