Commit 7b94bd9
[common] Added support of FP4 data type (NVIDIA#1779)
* Added support of FP4 data type
Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>
* Refactoring to BitsNum in progress
Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>
* Fixed compilation errors. All C++ tests passed
Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>
* Fixed a typo
Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Added FP4 guard to TMA tensor descriptor data type
Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fixed errors in JAX C++ extensions
Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed dummy NVFP4 C++ test file
Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>
* Make pytorch changes
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* Refactored the code per the review notes. Fixed JAX build error.
Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Removed unnecessary static casts
Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>
* Typo fix
Signed-off-by: Oleg Goncharov <64355998+Oleg-Goncharov@users.noreply.github.com>
* Pass correct num bits to create_2D_tensor_map; fixes CI
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
* inline funcs
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
---------
Signed-off-by: Oleg Goncharov <ogoncharov@nvidia.com>
Signed-off-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Oleg Goncharov <64355998+Oleg-Goncharov@users.noreply.github.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
1 parent e963e4a commit 7b94bd9
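The "Refactoring to BitsNum" and "Pass correct num bits to create_2D_tensor_map" items suggest that element sizes are tracked in bits rather than bytes, since an FP4 element occupies only half a byte. The following is a minimal sketch of that idea, not code from this commit: the names `DType`, `bits_of`, and `buffer_bytes` are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>

// Hypothetical data type tags; these are not the identifiers used in the commit.
enum class DType { kFloat32, kFloat16, kFloat8E4M3, kFloat4E2M1 };

// Width of one element in bits. FP4 is 4 bits wide, so two elements share a byte.
inline std::size_t bits_of(DType t) {
  switch (t) {
    case DType::kFloat32:    return 32;
    case DType::kFloat16:    return 16;
    case DType::kFloat8E4M3: return 8;
    case DType::kFloat4E2M1: return 4;
  }
  return 0;  // unreachable for valid tags
}

// Bytes needed to hold `numel` elements, rounded up to a whole byte.
// A bytes-per-element size would be 0 for FP4 under integer division,
// which is why the size is computed from the bit width instead.
inline std::size_t buffer_bytes(DType t, std::size_t numel) {
  return (numel * bits_of(t) + 7) / 8;
}
```

With this sketch, 1024 FP4 elements need 512 bytes, and an odd count such as 3 rounds up to 2 bytes.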
File tree
23 files changed, +391 -169 lines changed
- tests/cpp
  - operator
- transformer_engine
  - common
    - comm_gemm_overlap
    - fused_attn
    - include/transformer_engine
    - normalization
    - transpose
    - util
  - pytorch/csrc
    - extensions