How to use ncnn inference for a 4-dimensional speech model #1

Open
SEMLLYCAT opened this issue Sep 2, 2023 · 2 comments

@SEMLLYCAT

Hello, my model takes the following inputs:
input (1, 257, 1, 2)
state_h (1, 31, 32)
state_c (1, 32, 32)
How should I create the correct ncnn Mat for each of them for inference?
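
As a rough sketch of how these shapes relate to ncnn's layout: an ncnn Mat has no batch dimension and stores up to four dimensions as (c, d, h, w), so a leading batch of 1 is typically dropped before the data is wrapped. The helper `ncnn_dims` below is a hypothetical name used only for illustration; it does shape bookkeeping with NumPy and does not call ncnn.

```python
import numpy as np


def ncnn_dims(shape):
    """Map a PyTorch-style shape with a leading batch of 1 to ncnn-style dims.

    Hypothetical helper for illustration: it only does shape bookkeeping.
    """
    if shape[0] != 1:
        raise ValueError("this sketch assumes a batch size of 1")
    rest = tuple(shape[1:])  # an ncnn Mat carries no batch dimension
    names = {
        1: ("w",),
        2: ("h", "w"),
        3: ("c", "h", "w"),
        4: ("c", "d", "h", "w"),
    }
    if len(rest) not in names:
        raise ValueError("ncnn Mat supports 1 to 4 dimensions")
    return dict(zip(names[len(rest)], rest))


# Shapes quoted in the question above.
shapes = {
    "input": (1, 257, 1, 2),
    "state_h": (1, 31, 32),
    "state_c": (1, 32, 32),
}
layout = {name: ncnn_dims(shape) for name, shape in shapes.items()}

# Contiguous float32 buffers with the batch axis removed (placeholder values).
arrays = {
    name: np.zeros(shape[1:], dtype=np.float32) for name, shape in shapes.items()
}
```

Under these assumptions the (1, 257, 1, 2) input would become a 3-D Mat with c=257, h=1, w=2, and each state a 2-D Mat. Whether the converted model expects exactly this layout depends on the reshapes and transposes in the exported graph, so the blob shapes in the .param file still need to be checked.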

@magicse
Owner

magicse commented Sep 3, 2023

HiFi-GAN accepts a 1D input sequence whose number of channels equals num_mels = 80 (in ncnn this is represented as the height), and whose width is the number of time frames of the mel spectrogram.

Input mel spectrogram parameters:
n_fft = 1024
num_mels = 80
sampling_rate = 22050
hop_size = 256
win_size = 1024
fmin = 0
fmax = 8000

@SEMLLYCAT
Author

Thank you for your reply. I have a follow-up question; here is a minimal example:

import torch
import torch.nn as nn


class TestLstm(nn.Module):
    def __init__(self, input_size, hidden_size, rnn_type='LSTM', dropout=0, bidirectional=False):
        super(TestLstm, self).__init__()
        # Single-layer, unidirectional LSTM with batch_first=True.
        self.rnn = nn.LSTM(input_size, hidden_size, 1, dropout=dropout, batch_first=True, bidirectional=False)

    def forward(self, x, state_list):
        output, hidden_state = self.rnn(x, state_list)
        return output, hidden_state


if __name__ == "__main__":
    model = TestLstm(input_size=64, hidden_size=64, rnn_type='LSTM', dropout=0, bidirectional=False)
    model.eval()
    # Shapes as posted. Note: nn.LSTM with batch_first=True expects a 3-D input
    # (batch, seq, feature) and states of shape (num_layers, batch, hidden_size).
    x = torch.randn(1, 64, 31, 1)
    state_h0 = torch.randn(1, 31, 64, requires_grad=False)
    state_c0 = torch.randn(1, 31, 64, requires_grad=False)
    state_in = [(state_h0, state_c0)]
    out, state_out = model(x, state_in[0])

After conversion with onnx2ncnn, the LSTM layer in the .param file looks similar to this:
LSTM /lstm_t1/LSTM 3 3 x state_h0 state_c0 /lstm_t1/Transpose_1_out_0 state_out_h0 state_out_c0 0=32 1=8192 2=0
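
The fields of that line can be read off with a few lines of Python. This is only a sketch: it assumes that, for ncnn's LSTM layer, parameter ids 0, 1 and 2 mean num_output, weight_data_size and direction, and that weight_data_size counts the four gate blocks of input-to-hidden weights per direction. `parse_param_line` is a hypothetical helper, not part of ncnn.

```python
def parse_param_line(line):
    """Split one ncnn .param layer line into its fields (illustrative parser)."""
    tokens = line.split()
    layer_type, layer_name = tokens[0], tokens[1]
    num_inputs, num_outputs = int(tokens[2]), int(tokens[3])
    blob_end = 4 + num_inputs + num_outputs
    blobs = tokens[4:blob_end]
    params = {}
    for token in tokens[blob_end:]:
        key, value = token.split("=")
        params[int(key)] = int(value)
    return {
        "type": layer_type,
        "name": layer_name,
        "inputs": blobs[:num_inputs],
        "outputs": blobs[num_inputs:],
        "params": params,
    }


line = (
    "LSTM /lstm_t1/LSTM 3 3 x state_h0 state_c0 "
    "/lstm_t1/Transpose_1_out_0 state_out_h0 state_out_c0 0=32 1=8192 2=0"
)
layer = parse_param_line(line)

# Assumed meaning of the ids for the LSTM layer (see the note above).
num_output = layer["params"][0]
weight_data_size = layer["params"][1]
num_directions = 2 if layer["params"][2] == 2 else 1

# Four gates per direction, each of size num_output x input_size.
implied_input_size = weight_data_size // (4 * num_output * num_directions)
```

If those assumptions hold, this layer has a hidden size of 32 and an implied input size of 64. That agrees with the last dimension of the (1, 31, 32) state in the first post but not with hidden_size=64 in the test script, so the script and the .param line may not describe the same layer.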

Problem: the ncnn inference result is inconsistent with the ONNX result, and I cannot get the outputs to match. I am not sure whether I am using it correctly. Could you please help verify it? Thanks.
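
One way to narrow down such a mismatch is to compare the two outputs numerically after removing size-1 axes, since ncnn has no batch dimension. The sketch below uses NumPy only; `compare_outputs` and the tolerances are illustrative choices, and the arrays are random stand-ins rather than real model outputs.

```python
import numpy as np


def compare_outputs(reference, candidate, rtol=1e-3, atol=1e-4):
    """Compare two outputs after squeezing size-1 axes such as the batch axis."""
    ref = np.squeeze(np.asarray(reference, dtype=np.float32))
    cand = np.squeeze(np.asarray(candidate, dtype=np.float32))
    if ref.shape != cand.shape:
        # A shape mismatch usually points at a layout problem, not a numeric one.
        return {"same_shape": False, "max_abs_diff": None, "close": False}
    diff = np.abs(ref - cand)
    return {
        "same_shape": True,
        "max_abs_diff": float(diff.max()),
        "close": bool(np.allclose(ref, cand, rtol=rtol, atol=atol)),
    }


# Stand-in data: an "ONNX" output with a batch axis and an "ncnn" output without.
rng = np.random.default_rng(0)
onnx_out = rng.standard_normal((1, 31, 32)).astype(np.float32)
ncnn_out = onnx_out[0] + 1e-6  # tiny perturbation to mimic rounding noise

report = compare_outputs(onnx_out, ncnn_out)
```

Running the same comparison on each intermediate blob, starting from the inputs, can show whether the first divergence appears at the LSTM layer itself or already in a preceding reshape or transpose.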
