Edge tensor shapes, in graph order (runs of identical shapes collapsed):
1×197×768 (2 edges)
768 (2 edges)
1×197×768 (7 edges)
1×197×12×64 (1 edge)
1×197×768 (2 edges)
1×197×12×64 (1 edge)
1×12×197×64 (1 edge)
1×197×12×64 (1 edge)
1×12×197×64 (1 edge)
1×12×64×197 (1 edge)
1×12×197×197 (3 edges)
1×12×197×64 (1 edge)
1×197×12×64 (1 edge)
1×197×768 (6 edges)
1×197×3072 (8 edges)
1×197×768 (3 edges)
inputs: float32[1,197,768]
Identity  Identity_0  (input: float32[768])
Identity  Identity_1  (input: float32[768])
LayerNormalization  /layernorm_before/LayerNormalization  (Scale: float32[768]; B: float32[768])
MatMul  /attention/attention/query/MatMul  (B: float32[768,768])
Add  /attention/attention/query/Add  (A: float32[768])
MatMul  /attention/attention/key/MatMul  (B: float32[768,768])
Add  /attention/attention/key/Add  (A: float32[768])
Reshape  /attention/attention/Reshape  (shape: int64[4])
MatMul  /attention/attention/value/MatMul  (B: float32[768,768])
Add  /attention/attention/value/Add  (A: float32[768])
Reshape  /attention/attention/Reshape_1  (shape: int64[4])
Transpose  /attention/attention/Transpose
Reshape  /attention/attention/Reshape_2  (shape: int64[4])
Transpose  /attention/attention/Transpose_1
Transpose  /attention/attention/Transpose_2
MatMul  /attention/attention/MatMul
Div  /attention/attention/Div  (B: float32 = 8)
Softmax  /attention/attention/Softmax
MatMul  /attention/attention/MatMul_1
Transpose  /attention/attention/Transpose_3
Reshape  /attention/attention/Reshape_3  (shape: int64[3])
MatMul  /attention/output/dense/MatMul  (B: float32[768,768])
Add  /attention/output/dense/Add  (A: float32[768])
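The attention nodes above can be read as a shape trace. The sketch below is an illustration only, not part of the exported model: it assumes 12 heads of width 64 (consistent with the 1×197×12×64 edges and the Div constant 8 = sqrt(64)), and the Transpose permutations are assumptions, since the dump shows neither the `perm` attributes nor the values of the `shape` initializers. The helper names `matmul_shape` and `transpose_shape` are hypothetical.

```python
import math


def matmul_shape(a, b):
    """Output shape of a batched matrix product a @ b (shapes only, no data)."""
    assert a[-1] == b[-2], (a, b)
    batch = a[:-2] if len(a) >= len(b) else b[:-2]
    return (*batch, a[-2], b[-1])


def transpose_shape(shape, perm):
    """Output shape of a Transpose with the given permutation."""
    return tuple(shape[i] for i in perm)


hidden, heads = 768, 12
head_dim = hidden // heads                 # 64 (assumed head width)
x = (1, 197, hidden)                       # layer input

proj = matmul_shape(x, (hidden, hidden))   # query/key/value MatMul -> (1, 197, 768)
split = (1, 197, heads, head_dim)          # Reshape target (assumed values)

q = transpose_shape(split, (0, 2, 1, 3))   # (1, 12, 197, 64)
v = transpose_shape(split, (0, 2, 1, 3))   # (1, 12, 197, 64)
k_t = transpose_shape(split, (0, 2, 3, 1)) # (1, 12, 64, 197), key already transposed

scores = matmul_shape(q, k_t)              # (1, 12, 197, 197)
scale = math.sqrt(head_dim)                # 8.0, matches the Div constant
context = matmul_shape(scores, v)          # Softmax keeps the shape -> (1, 12, 197, 64)
merged = transpose_shape(context, (0, 2, 1, 3))  # (1, 197, 12, 64)
out = (1, 197, heads * head_dim)           # Reshape_3 -> (1, 197, 768)
```

Each intermediate shape here matches one of the edge shapes listed at the top of the dump, which is a quick way to check which Transpose feeds which MatMul.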
Add  /Add
LayerNormalization  /layernorm_after/LayerNormalization
MatMul  /intermediate/dense/MatMul  (B: float32[768,3072])
Add  /intermediate/dense/Add  (A: float32[3072])
Div  /intermediate/intermediate_act_fn/Div  (B: float32 = 1.41421353…)
Erf  /intermediate/intermediate_act_fn/Erf
Add  /intermediate/intermediate_act_fn/Add  (B: float32 = 1)
Mul  /intermediate/intermediate_act_fn/Mul
Mul  /intermediate/intermediate_act_fn/Mul_1  (B: float32 = 0.5)
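The five `intermediate_act_fn` nodes (Div, Erf, Add, Mul, Mul_1) look like the exact, erf-based GELU decomposed into primitive ops. A minimal scalar sketch follows, assuming the first Mul multiplies the activation input by the output of the Add node; the dump does not show the wiring, so that is an inference. The function name `gelu_erf` is hypothetical.

```python
import math


def gelu_erf(x: float) -> float:
    """Exact GELU, written step by step in the order of the graph nodes."""
    t = x / math.sqrt(2.0)  # Div,   B = 1.41421353…
    t = math.erf(t)         # Erf
    t = t + 1.0             # Add,   B = 1
    t = x * t               # Mul    (assumed: activation input times Add output)
    return t * 0.5          # Mul_1, B = 0.5
```

In the graph the same steps run elementwise on the 1×197×3072 tensor, which accounts for the run of 1×197×3072 edges.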
MatMul  /output/dense/MatMul  (B: float32[3072,768])
Add  /output/dense/Add  (A: float32[768])
Add  /output/Add
hidden_states: float32[1,197,768]
torch_jit