github.com/FACEGOOD/FACEGOOD-Audio2Face @main sqlite

80 symbols 229 edges 21 files 14 documented · 18%
README

Audio2Face

Notice

The Test part and the UE project for xiaomei created by FACEGOOD are not available for commercial use. They are for testing purposes only.

Description


ue

The case video:

https://user-images.githubusercontent.com/11623487/217445125-41ccc812-d93c-4789-b93a-73b1a008d821.mp4

high resolution

This project transforms audio into blendshape weights and uses them to drive the digital human, xiaomei, in a UE project.

Base Module


figure1

figure2

The framework contains three parts. In the formant network step, we perform fixed-function analysis of the input audio clip. In the articulation network, we concatenate an emotional state vector to the output of each convolution layer after the ReLU activation. The fully-connected layers at the end expand the 256+E abstract features to blendshape weights.
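
As a rough illustration of the articulation step, the sketch below shows one way an emotional state vector could be appended to a convolution output after ReLU. It uses NumPy rather than the repository's TensorFlow code, and the function name `concat_emotion`, the feature-map layout (batch, height, width, channels), and the value E = 16 are assumptions made for this example only.

```python
import numpy as np

def concat_emotion(conv_out, emotion):
    """Apply ReLU to conv_out, then append the emotion vector as extra channels.

    conv_out: (batch, height, width, channels) feature map (assumed layout)
    emotion:  (batch, E) emotional state vector
    returns:  (batch, height, width, channels + E)
    """
    activated = np.maximum(conv_out, 0.0)  # ReLU
    b, h, w, _ = activated.shape
    # Repeat the per-sample emotion vector at every spatial position.
    tiled = np.broadcast_to(emotion[:, None, None, :], (b, h, w, emotion.shape[1]))
    return np.concatenate([activated, tiled], axis=-1)

E = 16  # hypothetical emotion vector size
features = np.random.default_rng(0).standard_normal((2, 1, 4, 256))
emotion = np.zeros((2, E))
out = concat_emotion(features, emotion)
print(out.shape)  # (2, 1, 4, 272), i.e. 256 + E channels
```

With 256 feature channels, the concatenated result has 256+E channels, which matches the feature count that the final fully-connected layers receive.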

Usage


pipeline

This pipeline shows how we use FACEGOOD Audio2Face.

Test video 1 Test video 2 Ryan Yun from columbia.edu

Prepare data

  • step1: Record voice and video, and create the animation from the video in Maya. Note: the voice must contain vowels, exaggerated talking, and normal talking. The dialogue should cover as many pronunciations as possible.
  • step2: We process the voice with LPC to split it into segment frames corresponding to the animation frames in Maya.
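
The alignment in step2 can be pictured with a minimal sketch that cuts one audio chunk per animation frame. This is not the code from step1_LPC.py: the function name `split_by_animation_frames`, the 30 fps animation rate, the 260 ms chunk length, and the zero-padding scheme are all assumptions chosen for illustration.

```python
import numpy as np

def split_by_animation_frames(signal, sample_rate, fps=30, chunk_ms=260):
    """Cut one audio chunk per animation frame, centered on that frame's time.

    fps and chunk_ms are illustrative values, not confirmed by the source.
    """
    signal = np.asarray(signal, dtype=np.float64)
    chunk_len = chunk_ms * sample_rate // 1000
    n_frames = len(signal) * fps // sample_rate
    # Zero-pad both ends so the first and last chunks are full length.
    pad = chunk_len // 2
    padded = np.concatenate([np.zeros(pad), signal, np.zeros(pad)])
    starts = [i * sample_rate // fps for i in range(n_frames)]
    return np.stack([padded[s:s + chunk_len] for s in starts])

audio = np.arange(1, 16001, dtype=np.float64)  # 1 second of dummy samples at 16 kHz
chunks = split_by_animation_frames(audio, sample_rate=16000)
print(chunks.shape)  # (30, 4160): 30 animation frames, 260 ms of audio each
```

Each row of `chunks` then corresponds to exactly one animation frame, so audio features and blendshape weights can be paired by index.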

Input data

Use ExportBsWeights.py to export the weights file from Maya. This produces BS_name.npy and BS_value.npy.
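
A quick sanity check on the exported files might look like the sketch below. The array layout is an assumption, not something the source states: it treats BS_name.npy as a 1D array of blendshape names and BS_value.npy as a (frames, blendshapes) array. The helper `load_blendshape_weights` and the two blendshape names are hypothetical, and the stand-in files exist only so the example runs on its own.

```python
import os
import tempfile
import numpy as np

# Stand-in files so the sketch is runnable; real files come from ExportBsWeights.py.
tmp_dir = tempfile.mkdtemp()
np.save(os.path.join(tmp_dir, "BS_name.npy"), np.array(["jawOpen", "mouthSmile"]))  # hypothetical names
np.save(os.path.join(tmp_dir, "BS_value.npy"), np.array([[0.1, 0.0], [0.4, 0.2], [0.7, 0.3]]))

def load_blendshape_weights(folder):
    """Load names and per-frame weights, checking that their sizes agree."""
    names = np.load(os.path.join(folder, "BS_name.npy"))
    values = np.load(os.path.join(folder, "BS_value.npy"))
    if values.ndim != 2 or values.shape[1] != len(names):
        raise ValueError(f"expected (frames, {len(names)}) weights, got {values.shape}")
    return names, values

names, values = load_blendshape_weights(tmp_dir)
print(len(names), values.shape)  # 2 (3, 2)
```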

Use step1_LPC.py to process the wav file and get lpc_*.npy. This preprocesses the wav into 2D data.
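
To give a feel for what "2D data" can mean here, the following sketch turns one audio chunk into a 32 x 64 array by computing 32 LPC coefficients for each of 64 overlapping windows. It is an illustration under assumptions, not the repository's LPC implementation: it uses the textbook autocorrelation method solved with `numpy.linalg.solve`, and the names `lpc_coefficients` and `chunk_to_lpc_image`, the 256-sample window, and the Hanning window are choices made for this example. The 32 and 64 are taken from the (32, 64, 1) sample shape listed in the Data section.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Predictor coefficients a_k with x[n] ~ sum_k a_k * x[n-k] (autocorrelation method)."""
    frame = np.asarray(frame, dtype=np.float64)
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz matrix built from lags 0..order-1.
    idx = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    R = r[idx]
    # Small diagonal loading keeps the system solvable for silent frames.
    R = R + 1e-9 * (r[0] + 1.0) * np.eye(order)
    return np.linalg.solve(R, r[1:order + 1])

def chunk_to_lpc_image(chunk, n_windows=64, order=32, win_len=256):
    """Turn one audio chunk into an (order, n_windows) array of LPC coefficients."""
    chunk = np.asarray(chunk, dtype=np.float64)
    hop = (len(chunk) - win_len) // (n_windows - 1)
    window = np.hanning(win_len)
    columns = [
        lpc_coefficients(chunk[k * hop:k * hop + win_len] * window, order)
        for k in range(n_windows)
    ]
    return np.stack(columns, axis=1)

chunk = np.random.default_rng(0).standard_normal(4160)  # dummy 260 ms chunk at 16 kHz
image = chunk_to_lpc_image(chunk)
print(image.shape)  # (32, 64)
```

Adding a trailing channel axis to such an array gives the (32, 64, 1) per-sample shape used by the training data.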

train

We recommend using FACEGOOD Avatary to produce training data. It is fast and accurate. http://www.avatary.com

All training code is in the folder code/train.

The training data is stored in dataSet. On Windows, you can simply download the dataset from the link below, then use the script to train the model.

```powershell
.\train.bat
```

Or you can use the following commands to generate the data, then train and test the model.

[Note: you need to change the paths in every file to your own paths.]

```shell
cd code/train

python step1_LPC.py                  # process the wav file to get lpc_*.npy
python step3_concat_select_split.py  # generate training data and labels
python step4_train.py                # train the model
python step5_inference.py            # run inference with the trained model
```

test

In the folder /test, we supply a test application named AiSpeech.
We provide a pretrained model, zsmeif.pb.
In the folder /example/ueExample, we provide a packaged UE project that contains a digital human created by FACEGOOD, which can be driven by /AiSpeech/zsmeif.py.

You can follow the steps below to use it:

  1. Make sure a microphone is connected to the computer.
  2. Run the script in a terminal: python zsmeif.py
  3. When the terminal shows the message "run main", run FaceGoodLiveLink.exe, which is in the /example/ueExample/ folder.
  4. Click and hold the left mouse button on the screen in the UE project, then talk with the AI model and wait for the voice and animation response.

Dependencies

tensorflow-gpu 2.6
cudatoolkit 11.3.1
cudnn 8.2.1
scipy 1.7.1

python-libs: pyaudio requests websocket websocket-client

Note: the test can run on CPU.
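
Before running the test application, it can help to confirm that the Python packages are importable. The snippet below is a small sketch using only the standard library; the helper `missing_modules` is hypothetical, and the import names in `MODULES` are assumed to correspond to the packages listed above (for example, websocket-client is imported as `websocket`). It does not check versions or the CUDA toolkit.

```python
import importlib.util

# Import names assumed to correspond to the packages listed above.
MODULES = ["tensorflow", "scipy", "pyaudio", "requests", "websocket"]

def missing_modules(module_names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]

print("missing:", missing_modules(MODULES))
```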

Data


dataSet dims

dataSet  train_data          train_label_var  val_data           val_label_var
1        (24370, 32, 64, 1)  (24370, 38)      (1000, 32, 64, 1)  (1000, 38)
2        (27271, 32, 64, 1)  (27271, 38)      (1000, 32, 64, 1)  (1000, 38)
3        (12994, 32, 64, 1)  (12994, 38)      (1000, 32, 64, 1)  (1000, 38)
4        (12994, 32, 64, 1)  (12994, 37)      (1000, 32, 64, 1)  (1000, 37)
5        (22075, 32, 64, 1)  (22075, 37)      (1000, 32, 64, 1)  (1000, 37)
6        (24976, 32, 64, 1)  (24976, 37)      (1000, 32, 64, 1)  (1000, 37)
7        (22075, 32, 64, 1)  (22075, 37)      (1000, 32, 64, 1)  (1000, 37)
8        (24976, 32, 64, 1)  (24976, 37)      (1000, 32, 64, 1)  (1000, 37)
9        (17368, 32, 64, 1)  (17368, 37)      (1000, 32, 64, 1)  (1000, 37)
10       (20269, 32, 64, 1)  (20269, 37)      (1000, 32, 64, 1)  (1000, 37)
11       (25370, 32, 64, 1)  (25370, 37)      (1000, 32, 64, 1)  (1000, 37)
12       (28271, 32, 64, 1)  (28271, 37)      (1000, 32, 64, 1)  (1000, 37)
13       (20019, 32, 64, 1)  (20019, 37)      (1000, 32, 64, 1)  (1000, 37)
14       (22920, 32, 64, 1)  (22920, 37)      (1000, 32, 64, 1)  (1000, 37)
15       (22920, 32, 64, 1)  (22920, 37)      (1000, 32, 64, 1)  (1000, 37)
16       (25485, 32, 64, 1)  (25485, 37)      (1000, 32, 64, 1)  (1000, 37)

We only used dataSet 4 to dataSet 16.
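
Since dataSets 4 to 16 share the same per-sample shapes, a small check like the one below can catch a mismatched download before training. It is a sketch only: the function name `check_dataset` is hypothetical, the expected shapes are taken from the table above, and the arrays are zero-filled stand-ins rather than real data.

```python
import numpy as np

def check_dataset(data, labels, n_outputs=37):
    """Raise ValueError unless data is (N, 32, 64, 1) and labels is (N, n_outputs)."""
    if data.ndim != 4 or data.shape[1:] != (32, 64, 1):
        raise ValueError(f"unexpected data shape {data.shape}")
    if labels.shape != (data.shape[0], n_outputs):
        raise ValueError(f"unexpected label shape {labels.shape}")
    return data.shape[0]

# Small stand-in arrays; the real val_data in the table has 1000 samples.
val_data = np.zeros((10, 32, 64, 1), dtype=np.float32)
val_label = np.zeros((10, 37), dtype=np.float32)
print(check_dataset(val_data, val_label))  # 10
```

Passing `n_outputs=38` would be the corresponding check for dataSets 1 to 3, whose labels have 38 columns.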

The testing data, Maya model, and ue4 test project can be downloaded from the link below.

Datasets with index 4 to 16 are the ones used for training.

data_all code : n6ty

GoogleDrive

Add maya model in maya-model folder

Update

Uploaded the LPC source into the code folder.

Reference


Audio-Driven Facial Animation by Joint End-to-End Learning of Pose and Emotion

Contact

Wechat: FACEGOOD_CHINA
Email: jelo.wang@gmail.com
Discord: https://discord.gg/V46y6uTdw8

License

Audio2Face Core is released under the terms of the MIT license. See COPYING for more information or see https://opensource.org/licenses/MIT.

Core symbols most depended-on inside this repo

run · called by 9 · code/train/step5_inference.py
send_ws_asr_data · called by 6 · code/test/AiSpeech/lib/aispeech/api_websocket.py
update_token · called by 5 · code/test/AiSpeech/lib/aispeech/api_aispeech.py
ping · called by 3 · code/test/AiSpeech/lib/aispeech/api_websocket.py
acquire · called by 3 · code/test/AiSpeech/lib/socket/ue4_socket.py
conv2d_layer · called by 3 · code/train/model_paper.py
play_audio_data_thread · called by 2 · code/test/AiSpeech/lib/audio/api_audio.py
tts · called by 2 · code/test/AiSpeech/lib/aispeech/api_aispeech.py

Shape

Method 49
Function 16
Class 15

Languages

Python 100%

Modules by API surface

code/train/model_paper.py · 14 symbols
code/test/AiSpeech/lib/audio/api_audio.py · 12 symbols
code/test/AiSpeech/zsmeif.py · 11 symbols
code/test/AiSpeech/lib/aispeech/api_websocket.py · 10 symbols
code/test/AiSpeech/lib/socket/ue4_socket.py · 7 symbols
code/test/AiSpeech/lib/aispeech/api_aispeech.py · 6 symbols
code/train/step5_inference.py · 5 symbols
code/test/AiSpeech/lib/tensorflow/input_lpc_output_weight.py · 5 symbols
code/train/tfv1/model_paper.py · 2 symbols
code/train/step4_train.py · 2 symbols
code/test/AiSpeech/lib/tensorflow/input_wavdata_output_lpc.py · 2 symbols
code/train/tfv1/step4_train.py · 1 symbol

For agents

$ claude mcp add FACEGOOD-Audio2Face \
  -- python -m otcore.mcp_server <graph>
