-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathindex.html
1522 lines (1401 loc) · 59.5 KB
/
index.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
<!doctype html>
<html>
<head>
<meta charset="utf-8"/>
<title>FPBench</title>
<meta name="viewport" content="width=device-width, initial-scale=1" />
<link rel="stylesheet" type="text/css" href="fpbench.css">
<style>
#info {
padding-bottom: 1.25em;
margin-bottom: 1.5em;
border-bottom: 1px solid gray;
}
details { clear: both; margin-bottom: 1.5em; }
summary { list-style-type: none; }
time { float: right; }
.title { color: #198C0E; }
details:has(.abstract) .title { cursor: pointer; }
.speaker { font-style: italic; font-size: 90%; }
.abstract { font-size: 80%; margin-bottom: 2em; }
</style>
</head>
<body>
<header>
<a href='index.html'>
<img src='img/logo.png' height='150' alt='FPBench Logo' />
<h1>FPBench</h1>
</a>
<p>Numerics community meetings and resources</p>
<ul>
<li><a href="index.html">Events</a></li>
<li><a href="education.html">Education</a></li>
<li><a href="community.html">Tools</a></li>
<li><a href="https://groups.google.com/a/fpbench.org/g/fpbench">Mailing List</a></li>
</ul>
</header>
<div class="pitch">
<a href="https://groups.google.com/a/fpbench.org/g/fpbench">Join us</a> monthly to talk numerics.
</div>
<p id='info'>
FPBench meets for an hour, on Zoom, on the first Thursday of every month at 9am Pacific.<abbr title="Except that we skip months for the annual workshop and for the winter holidays."><sup>†</sup></abbr> You can <a href="https://forms.gle/H5aodraRcgcJFq6W8">submit a proposal</a> to present.
</p>
<main>
<!--
<details>
<summary>
<time>TIME</time>
<div class="title">
TITLE, use sentence not title case
</div>
<div class="speaker">
TODO, University of TODO
</div>
</summary>
<div class="abstract">
TODO
</div>
</details>
-->
<details>
<summary>
<time>
Feb 6, 2025
</time>
<div class="title">
LP meets PL: efficient linear programming using a geometric lens
</div>
<div class="speaker">
Mridul Aanjaneya, Rutgers University
</div>
</summary>
<div class="abstract">
Linear programs (LP) arise in many domains, such as robotics,
databases, machine learning, computer graphics, etc. LP solvers are
also widely used in the PL community for analyzing vulnerabilities in C
source code, designing correctly rounded Math libraries, repairing deep
neural networks, Presburger arithmetic for polyhedral compilation, etc.
Linear programs with billions of constraints are beyond the reach of
modern LP solvers. However, for many practical applications, the linear
program is "low-dimensional", i.e., it has many constraints but only a
few unknown variables. In this talk, we'll first discuss how random
sampling can be used for provably solving (in terms of expected
iterations) feasible low-dimensional linear programs with billions of
constraints. Next, we'll discuss how this approach requires
enhancements for infeasible linear programs, where a small number of
conflicting constraints make the linear program have no solution. We'll
show that a geometric perspective can be used for efficiently
eliminating the conflicting constraints and computing a solution that
satisfies the maximum number of constraints. We'll present applications
of our LP solver for generating correctly rounded Math libraries for
floating-point variants. Time permitting, we'll also discuss
applications in interpretable machine learning.
</div>
</details>
<details>
<summary>
<time>
Dec 5, 2024
</time>
<div class="title">
Aster: sound mixed fixed-point quantizer for neural networks
</div>
<div class="speaker">
Debasmita Lohar, Karlsruhe Institute of Technology
</div>
</summary>
<div class="abstract">
<p>
Neural networks are increasingly becoming integral to safety-critical
applications, such as controllers in embedded systems. While formal
safety verification focuses on idealized real-valued networks, practical
applications require quantization to finite precision, inevitably
introducing roundoff errors. Manually optimizing precision, especially
for fixed-point implementation, while ensuring safety is complex and
time-consuming.
</p>
<p>
In this talk, I will introduce Aster, the sound, fully automated,
mixed-precision, fixed-point quantizer for deep feed-forward neural
networks. Aster reduces the quantization problem to a mixed-integer
linear programming (MILP) problem, thus efficiently determining minimal
precision to guarantee predefined error bounds. Our evaluations show
that Aster's optimized code reduces machine cycles when compiled to an
FPGA with a commercial HLS compiler compared to (sound)
state-of-the-art. Furthermore, Aster handles significantly more
benchmarks faster, especially for larger networks with thousands of
parameters.
</p>
</div>
</details>
<details>
<summary>
<time>Nov 7, 2024</time>
<div class="title">
Exploring FPGA designs for MX and beyond
</div>
<div class="speaker">
Ebby Samson, Imperial College London
</div>
</summary>
<div class="abstract">
A number of companies recently worked together to release the new Open
Compute Project MX standard for low-precision computation, aimed at
efficient neural network implementation. In our work, we describe and
evaluate the first open-source FPGA implementation of the arithmetic
defined in the standard. Our designs fully support all the standard's
concrete formats for conversion into and out of MX formats and for the
standard-defined arithmetic operations, as well as arbitrary
fixed-point and floating-point formats. Certain elements of the
standard are left as implementation-defined, and we present the first
concrete FPGA-inspired choices for these elements. Our library of
optimized hardware components is available open source, alongside our
open-source Pytorch library for quantization into the new standard,
integrated with the Brevitas library so that the community can develop
novel neural network designs quantized with MX formats in mind. Our
testing shows that MX is very effective for formats such as INT5 or FP6
which are not natively supported on GPUs. This gives FPGAs an advantage
as they have the flexibility to implement a custom datapath and take
advantage of the smaller area footprints offered by these formats.
</div>
</details>
<details>
<summary>
<time>Oct 3, 2024</time>
<div class="title">
Geometric predicates for unconditionally robust elastodynamics simulation
</div>
<div class="speaker">
Daniele Panozzo, Courant Institute of Mathematical Sciences in New York University
</div>
</summary>
<div class="abstract">
The numerical solution of partial differential equations (PDE) is
ubiquitously used for physical simulation in scientific computing and
engineering. Ideally, a PDE solver should be opaque: the user provides
as input the domain boundary, boundary conditions, and the governing
equations, and the code returns an evaluator that can compute the
value of the solution at any point of the input domain. This is
surprisingly far from being the case for all existing open-source or
commercial software, despite the research efforts in this direction
and the large academic and industrial interest. To a large extent,
this is due to lack of robustness in geometric algorithms used to
create the discretization, detect collisions, and evaluate element
validity.
I will present the incremental potential contact simulation paradigm,
which provides strong robustness guarantees in simulation codes,
ensuring, for the first time, validity of the trajectories accounting
for floating point rounding errors over an entire elastodynamic
simulation with contact. A core part of this approach is the use of a
conservative line-search to check for collisions between geometric
primitives and for ensuring validity of the deforming elements over
linear trajectories.
I will discuss both problems in depth, showing that SOTA approaches
favor numerical efficiency but are unfortunately not robust to
floating point rounding, leading to major failures in simulation. I
will then present an alternative approach based on judiciously using
rational and interval types to ensure provable correctness, while
keeping a running time comparable with non-conservative methods.
To conclude, I will discuss a set of applications enabled by this
approach in microscopy and biomechanics, including traction force
estimation on a live zebrafish and efficient modeling and simulation
of fibrous materials.
</div>
</details>
<details>
<summary>
<time>Sep 5, 2024</time>
<div class="title">
Designing Type Systems for Rounding Error Analysis
</div>
<div class="speaker">
Ariel Kellison, Cornell University
</div>
</summary>
<div class="abstract">
A central challenge in scientific computing is managing the tradeoff between accuracy
and performance. Scientific applications often require high precision to produce
meaningful results, but achieving this precision can reduce computational speed
and efficiency. For example, using higher precision arithmetic can minimize rounding
errors and improve the reliability of results, but it typically demands more processing
power and memory, which negatively affects performance. As a result, scientific software
developers must carefully balance the need for accuracy with the need for acceptable
performance.
In recent years, programming languages like Rust have demonstrated how carefully
designed type systems can help developers write high-performance code while ensuring
critical properties, such as memory safety. In this talk, we will present our work on
designing type systems that provide guarantees about the accuracy of floating-point
computations. By introducing types that can quantitatively represent the error bounds
of floating-point computations, we can create statically typed programming languages
that alert developers to potentially significant inaccuracies early in the development
process.
</div>
</details>
<details>
<summary>
<time>
Jul 11, 2024
</time>
<div class="title">
<a href="talks/fptalks24.html">FPTalks 2024 annual workshop</a>
</div>
<div class="speaker">
12 speakers at the cutting edge of numerics research
</div>
</summary>
</details>
<details>
<summary>
<time>
Jun 6, 2024
</time>
<div class="title">
On the precision loss in approximate homomorphic encryption
</div>
<div class="speaker">
Rachel Player, Royal Holloway University of London
</div>
</summary>
<div class="abstract">
Since its introduction at Asiacrypt 2017, the CKKS approximate homomorphic encryption
scheme has become one of the most widely used and implemented homomorphic encryption
schemes. Due to the approximate nature of the scheme, application developers using
CKKS must ensure that the evaluation output is within a tolerable error of the corresponding
plaintext computation. Choosing appropriate parameters requires a good understanding of
how the noise will grow through the computation. A strong understanding of the noise
growth is also necessary to limit the performance impact of mitigations to the attacks
on CKKS presented by Li and Micciancio (Eurocrypt 2021). In this work we present a
comprehensive noise analysis of CKKS, that considers noise coming both from the encoding and
homomorphic operations. Our main contribution is the first average-case analysis for CKKS
noise, and we also introduce refinements to prior worst-case noise analyses. We develop
noise heuristics both for the original CKKS scheme and the RNS variant presented
at SAC 2018. We then evaluate these heuristics by comparing the predicted noise growth
with experiments in the HEAAN and FullRNS-HEAAN libraries, and by comparing with a
worst-case noise analysis as done in prior work. Our findings show mixed results: while
our new analyses lead to heuristic estimates that more closely model the observed noise
growth than prior approaches, the new heuristics sometimes slightly underestimate the
observed noise growth. This evidences the need for implementation-specific noise analyses
for CKKS, which recent work has shown to be effective for implementations of similar schemes.
This is joint work with Anamaria Costache, Benjamin R. Curtis, Erin Hales, Sean Murphy,
and Tabitha Ogilvie.
</div>
</details>
<details>
<summary>
<time>
May 2, 2024
</time>
<div class="title">
ReLU hull approximation
</div>
<div class="speaker">
Zhongkui Ma, The University of Queensland
</div>
</summary>
<div class="abstract">
Neural networks have offered distinct advantages over traditional
techniques. However, the opaque neural networks render their
performance vulnerable, in the presence of adversarial samples. This
underscores the imperative of formal verification of neural networks,
especially in the safety-critical domain. Deep neural networks are
composed of multiple hidden layers with neurons, where mathematical
operations stack linear and nonlinear (activation functions)
operations. Using linear inequalities plays a crucial role in
constraining and deducing the ranges of results from nonlinear
operations. Considering correlations among input variables of multiple
neurons leads to constraints known as multi-neuron constraints, posing
a non-trivial high-dimensional challenge in computing the convex hull
of functions. This work is dedicated to designing methods for computing
multi-neuron constraints for the ReLU function to serve the robustness
verification of neural networks. We have introduced a novel approach
WRALU (Wrapping ReLU) for computing the convex hull of a ReLU function
(<a href='https://popl24.sigplan.org/details/POPL-2024-popl-research-papers/77/ReLU-Hull-Approximation'>published in POPL’24</a>).
This method significantly reduces computation
time and complexity, constructing fewer constraints to achieve more
precise approximations while handling higher dimensions.
</div>
</details>
<details>
<summary>
<time>
Apr 4, 2024
</time>
<div class="title">
Rigorous error analysis for logarithmic number systems
</div>
<div class="speaker">
Thanh Son Nguyen, University of Utah
</div>
</summary>
<div class="abstract">
Logarithmic Number Systems (LNS) hold considerable promise in
helping reduce the number of bits needed to represent a high
dynamic range of real-numbers with finite precision, and also
efficiently support multiplication and division. However,
under LNS, addition and subtraction turn into non-linear
functions that must be approximated—typically using
precomputed table-based functions. Additionally, multiple
layers of error correction are typically needed to improve
result accuracy. Unfortunately, previous efforts have not
characterized the resulting error bound. We provide the first
rigorous analysis of LNS, covering detailed techniques such as
co-transformation that are crucial to implementing subtraction
with reasonable accuracy. We provide theorems capturing the
error due to table interpolations, the finite precision of
pre-computed values in the tables, and the error introduced by
fix-point multiplications involved in LNS implementations. We
empirically validate our analysis using a Python
implementation, showing that our analytical bounds are tight,
and that our testing campaign generates inputs diverse-enough
to almost match (but not exceed) the analytical bounds. We
close with discussions on how to adapt our analysis to LNS
systems with different bases and also discuss many pragmatic
ramifications of our work in the broader arena of scientific
computing and machine learning.
</div>
</details>
<details>
<summary>
<time>
Mar 7, 2024
</time>
<div class="title">
Low-bit quantization for efficient and accurate LLM serving
</div>
<div class="speaker">
Chien-Yu Lin, University of Washington
</div>
</summary>
<div class="abstract">
The growing demand for Large Language Models (LLMs) in
applications such as content generation, intelligent chatbots,
and sentiment analysis poses considerable challenges for LLM
service providers. To efficiently use GPU resources and boost
throughput, batching multiple requests has emerged as a
popular paradigm; to further speed up batching, LLM
quantization techniques reduce memory consumption and increase
computing capacity. However, prevalent quantization schemes
(e.g., 8-bit weight-activation quantization) cannot fully
leverage the capabilities of modern GPUs, such as 4-bit
integer operators, resulting in sub-optimal performance.
To maximize LLMs' serving throughput, we introduce Atom, a
low-bit quantization method that achieves high throughput
improvements with negligible accuracy loss. Atom significantly
boosts serving throughput by using low-bit operators and
considerably reduces memory consumption via low-bit
quantization. It attains high accuracy by applying a novel
mixed-precision and fine-grained quantization process. We
evaluate Atom on 4-bit weight-activation quantization setups
in the serving context. Atom improves end-to-end throughput by
up to 7.73× compared to the FP16 and by 2.53× compared to INT8
quantization, while maintaining the same latency target.
</div>
</details>
<details>
<summary>
<time>
Feb 1, 2024
</time>
<div class="title">
Towards optimized multiplierless arithmetic circuits
</div>
<div class="speaker">
Rémi Garcia, University of Rennes
</div>
</summary>
<div class="abstract">
Multiplication is a basic operator used in many applications. When
implemented in embedded systems, e.g. FPGAs, these algorithms require
highly optimized hardware. Improving the multiplication implementation
is an important part of the final hardware cost reduction. This talk
will expose the recent advances on the automatic design of
multiplication of a variable with multiple <emph>a priori</emph> known
constants, this problem is called the Multiple Constant Multiplication
(MCM) problem. Using Integer Linear Programming (ILP) permits to
significantly reduce the hardware cost when MCM is implemented using
additions and bit-shifts. The ILP approach has been efficiently used
for various hardware design problems but other approaches exists, such
as SAT/SMT, Constraint Programming, etc. This presentation will
introduce a few concepts relating to this ecosystem.
</div>
</details>
<details>
<summary>
<time>
Dec 7, 2023
</time>
<div class="title">
Application-specific arithmetic
</div>
<div class="speaker">
Florent de Dinechin, INSA-Lyon/Inria; and Martin Kumm, Fulda University of Applied Sciences
</div>
</summary>
<div class="abstract">
A software designer has to carefully select the arithmetic that best
suits each step of her application, but the choice is constrained: only
a handful of operators are supported in hardware, in only in a handful
of precisions.
Conversely, when designing for hardware or FPGAs, these constraints
become degrees of freedom. Any precision can be designed, any number
encoding, but also any operation and function, provided you are clever
enough to implement it efficiently as a circuit.
<br>
This talk will expose some of the opportunities offered by this freedom,
and some of the methodologies and tools that allow a designer to build
and compose application-specific operators.
It may include some inadvertent advertising for our upcoming 800-page
book dedicated to these topics.
</div>
</details>
<details>
<summary>
<time>
Nov 2, 2023
</time>
<div class="title">
Overview of numerics on the Java Platform
</div>
<div class="speaker">
Joseph D. Darcy, Oracle
</div>
</summary>
<div class="abstract">
Come hear an overview of numeric support of the Java platform, an on-going
story ranging over more than a quarter century and spanning standards,
specifications, virtual machines, libraries, and compilers, including resolving
details as small as 2^-1074. The talk does not assume audience members will
already be familiar with the details of the Java platform; attendees are welcome
to ask questions about such details.
<br>
Joe is a long-time member on the JDK engineering team, first at Sun and later
Oracle. As “Java Floating-Point Czar” he has looked after Java numerics including
contributing to the design of strictfp, adding numerous floating-point math library
methods, and adding hexadecimal floating-point literals to the Java language and
library. Joe was a participant in and interim editor of the 2008 revision to the
IEEE 754 floating-point standard. Outside of numerics, Joe has done other
foundational work on the Java platform including core libraries development, Java
language changes including Project Coin, infrastructure improvements, and reviewing
platform interface updates, together with a smattering of project management and
release management along the way.
</div>
</details>
<details>
<summary>
<time>
Oct 5, 2023
</time>
<div class="title">
Custom elementary functions for hardware accelerators
</div>
<div class="speaker">
Benjamin Carleton and Adrian Sampson, Cornell University
</div>
</summary>
<div class="abstract">
This talk is a work-in-progress report about our efforts to
generate customized hardware implementations of fixed-point
function approximations for application-specific accelerators.
We built a new compiler from FPCore
to <a href="https://calyxir.org/">Calyx</a>, our lab’s IR and
compiler for translating high-level descriptions to hardware
designs. The interesting part is the need to generate
efficient polynomial approximations for elementary
functions—and the freedom to customize these implementations
for the specific accelerator’s context. We have preliminary
(but encouraging) comparisons to a commercial high-level
synthesis (HLS) compiler. We will seek feedback on the many
possibilities for next steps.
</div>
</details>
<details>
<summary>
<time>
Sep 7, 2023
</time>
<div class="title">
A PVS formalization of Taylor error expressions
</div>
<div class="speaker">
Jonas Katona, Yale University
</div>
</summary>
<div class="abstract">
Due to the limits of finite precision arithmetic, as one works
with floating-point representations of real numbers in
computer programs, round-off errors tend to accumulate under
mathematical operations and may become unacceptably large.
Symbolic Taylor Error Expressions is a technique used in the
tool FPTaylor to bound the round-off errors accumulated under
differentiable mathematical operations which provides a good
trade-off between efficiency and accuracy. In this
presentation, I will present a formally verified
implementation of Symbolic Taylor Error Expressions in a
specification language, automated theorem prover, and
typechecker called Prototype Verification System (PVS). I will
also go over how this formal specification in PVS will enable
the integration of Symbolic Taylor Expressions in PRECiSA, a
prototype static analysis tool that can compute verified
round-off error bounds for floating-point programs.
</div>
</details>
<details>
<summary>
<time>
Aug 3, 2023
</time>
<div class="title">
Automated reasoning over the reals
</div>
<div class="speaker">
Soonho Kong, Amazon Web Services
</div>
</summary>
<div class="abstract">
In cyber-physical systems such as autonomous driving systems
and robotics components, ordinary differential equations and
nonlinear functions such as the exponential function and the
trigonometric functions appear almost universally. To adopt
automated reasoning techniques in these domains, it is
necessary to develop a tool that can handle logical encodings
with those functions.
In this talk, I will present the design and implementation of
a delta-decision procedure, dReal. First, we will discuss the
theory of delta-decidability briefly. Then I will explain 1)
how the tool handles nonlinear functions and ordinary
differential equations, 2) how it can solve optimization and
synthesis problems (∃∀-formulas), and 3) how the tool utilizes
multiple cores to speed up the solving process in parallel. I
will show that the tool helps tackle instances from real-world
applications including bounded model-checking for
cyber-physical systems, nonlinear global optimization, and
Lyapunov-stability analysis in robotics.
</div>
</details>
<details>
<summary>
<time>
Jul 6, 2023
</time>
<div class="title">
<a href="talks/fptalks23.html">FPTalks 2023 annual workshop</a>
</div>
<div class="speaker">
12 speakers at the cutting edge of numerics research
</div>
</summary>
</details>
<details>
<summary>
<time>
Jun 1, 2023
</time>
<div class="title">
Floating-point education roundtable
</div>
<div class="speaker">
Mike Lam, James Madison University
</div>
</summary>
<div class="abstract">
Members of the FPBench community are well-aware of the
pitfalls of floating-point representation, but awareness among
the wider population of software authors is
often <a href="https://doi.org/10.1109/IPDPS.2018.00068">painfully
inadequate</a>. There is a surprising dearth of quality
educational materials and techniques for teaching these issues
to students (both undergraduate and graduate). In this
roundtable discussion, we’ll look at some of the current
approaches and brainstorm possible approaches to improve the
state of the art.
</div>
</details>
<details>
<summary>
<time>
May 4, 2023
</time>
<div class="title">
Floating-point accuracy anecdotes from real-world products
</div>
<div class="speaker">
Shoaib Kamil, Adobe Research
</div>
</summary>
<div class="abstract">
Floating point issues must often be surmounted when building
software products that rely on precise computations while
still demanding performance. In this talk, I describe two
scenarios where existing tools fall short. First, I will
describe our efforts to utilize interval computation to better
bound quantities representing real numbers, and the rabbit
hole we encountered when trying to find an existing
off-the-shelf cross-platform interval computation library.
From this exploration, we built a cross-platform interval
benchmark suite using hand-crafted expressions from real-world
code and from FPBench, and evaluated four existing interval
libraries for correctness, interval width, and speed. Second,
I will describe our experiences using GPU floating point math
to emulate integer computations, and the pitfalls even when
computing with exactly-representable floating point numbers.
</div>
</details>
<details>
<summary>
<time>
Apr 6, 2023
</time>
<div class="title">
Formal and semi-formal verification of C numerics
</div>
<div class="speaker">
Samuel D. Pollard and Ariel Kellison,
Sandia National Laboratories
</div>
</summary>
<div class="abstract">
For better or worse, the world runs on C, and to a lesser
extent, C with IEEE 754 floating point. At Sandia Labs, we
often are tasked with the formal verification of
high-consequence programs in C. Two methods we use at Sandia
to accomplish this verification are: fully constructive proofs
of correctness (in the Coq theorem prover), and deductive
verification (using Frama-C). In both cases, the addition of
numerical computations adds complexity to the programs, via
numerical error (e.g., discretization or roundoff error), or
run-time errors (such as floating-point exceptions). We
discuss our efforts at Sandia Labs to both make numerical
models of C programs using Frama-C and its ANSI C
specification language (ACSL), as well as functional models in
Coq, and how to check whether the relevant C code implements
these specifications.
</div>
</details>
<details>
<summary>
<time>
Mar 2, 2023
</time>
<div class="title">
Automatically generating numerical PDE solvers with Finch
</div>
<div class="speaker">
Eric Heisler,
University of Utah
</div>
</summary>
<div class="abstract">
<p>Finch is a domain-specific language for numerically solving differential equations. It employs a variety of numerical techniques, generates code for a number of different target frameworks and languages, and allows almost arbitrary equation input. In addition, it accepts unrestricted modification of the generated code to provide maximal flexibility and hand-optimization. One downside to this (perhaps excessive) flexibility is that numerical instability and the occasional typo can easily turn a solution into a pile of NaNs.
<p>Eric will introduce Finch, and demonstrate some of the features relevant to the numerics. We will look at some examples where things turn out just the way we hoped, and some where they don't.
</div>
</details>
<details>
<summary>
<time>
Dec 8, 2022
</time>
<div class="title">
Low-precision FP8 formats: background and industry status
</div>
<div class="speaker">
Marius Cornea, Intel
</div>
</summary>
<div class="abstract">
Marius Cornea, a senior principal engineer at Intel working on math libraries and floating-point (and one of the coauthors of the IEEE Standard 754-2008), will be presenting about new industry standards for low-precision formats. The presentation will examine the reasons for which FP8 formats have emerged, definition details, and variations proposed by industry and academia. We will enumerate a few known (mostly planned) implementations of FP8. We will also look at current standardization work for FP8 formats.
</div>
</details>
<details>
<summary>
<time>
Nov 18, 2022
</time>
<div class="title">
Correctness 2022 annual workshop
</div>
<div class="speaker">
Workshop on HPC software correctness, co-located with SC
</div>
</summary>
</details>
<details>
<summary>
<time>
Nov 3, 2022
</time>
<div class="title">
Function-Based volumetric design and fabrication
</div>
<div class="speaker">
Chris Uchytil,
Mechanical Engineering,
University of Washington
</div>
</summary>
<div class="abstract">
We present a novel computer-aided design (CAD) modeling system designed
to support a modeling range that matches the fabrication range of
modern additive manufacturing (AM) systems that are capable of
producing large-scale, high-resolution parts. To be useful to designers
and fabricators, a modeling system must perform essential functions
(such as execution of modeling operations, visualization, and slicing)
at interactive rates, and achieving the necessary performance depends
on efficient use of both memory and computing capacity. Our approach to
achieving necessary performance levels is to implement an implicit
function-based representation (f-rep) modeling system, not using a
computer algebra system, but instead using just-in-time (JIT)
compilation of user-defined functions. Efficient memory usage is
achieved via a sparse volume data structure that builds on previous
work by Hoetzlein [Hoetzlein 2016]. Computational efficiency is
achieved through a combination of interval arithmetic (IA) and
massively parallel evaluation on the GPU. We follow [Keeter 2020] and
employ IA as the basis for local pruning of the function evaluation
tree to minimize the required number of function evaluations, and we
take advantage of GPU-parallelism to significantly enhance
computational throughput.
</div>
</details>
<details>
<summary>
<time>
Oct 6, 2022
</time>
<div class="title">
Finding inputs that trigger GPU exceptions
</div>
<div class="speaker">
Ignacio Laguna,
Center for Applied Scientific Computing,
LLNL
</div>
</summary>
<div class="abstract">
Testing code for floating-point exceptions is crucial as exceptions can
quickly propagate and produce unreliable numerical answers. The
state-of-the-art to test for floating-point exceptions in GPUs is quite
limited and solutions require the application's source code, which
precludes their use in accelerated libraries where the source is not
publicly available. We present an approach to find inputs that trigger
floating-point exceptions in black-box GPU functions, i.e., functions
where the source code and information about input bounds are
unavailable. Our approach is the first to use Bayesian optimization
(BO) to identify such inputs and uses novel strategies to overcome the
challenges that arise in applying BO to this problem. We implement our
approach in the Xscope framework and demonstrate it on 58 functions
from the CUDA Math Library and functions from ten HPC programs. Xscope
is able to identify inputs that trigger exceptions in about 72% of the
tested functions.
</div>
</details>
<details>
<summary>
<time>
Sep 2, 2022
</time>
<div class="title">
Birds-of-a-feather
</div>
<div class="speaker">
The community discussed the challenges of
connecting various analysis tools
</div>
</summary>
</details>
<details>
<summary>
<time>
Aug 4, 2022
</time>
<div class="title">
Database workbench:
mixed-initiative design space search
</div>
<div class="speaker">
Edward Misback,
University of Washington
</div>
</summary>
<div class="abstract">
The Herbie web demo is intended as an accessible tool for making a
rough guess at the best way to improve the accuracy of a floating-point
expression. It aims to support students learning about floating point
error, casual users who just want a high-accuracy replacement
expression, and expert users who are comfortable interpreting its
results. However, it is designed as a single-step tool and leaves
little room for users to iterate or adapt it to numerical analysis
tasks, even though its underlying functions have broader applications.
We describe how an alternative “database workbench” interface will
allow users to leverage Herbie’s power while exploring the design space
of replacement expressions more flexibly, including support for the
development of third-party expression analysis plugins. We would like
to discuss with the FPBench community whether the workflow we describe
is compatible with other numerical analysis tools and needs, either as
plugins or as separate workbenches with a similar interface.
<a href='https://docs.google.com/presentation/d/1RQ3tDkAzkoFcCZiA2iDpSuHVEJ-aYESwzwvV8M4Ec_s/edit#slide=id.p'>Slides</a>
</div>
</details>
<details>
<summary>
<time>
Jul 6, 2022
</time>
<div class="title">
<a href='talks/fptalks22.html'>FPTalks 2022 annual workshop</a>
</div>
<div class="speaker">
11 speakers at the cutting edge of numerics research
</div>
</summary>
</details>
<details>
<summary>
<time>
May 5, 2022
</time>
<div class="title">
Automatic datapath optimization using e-graphs
</div>
<div class="speaker">
Samuel Coward,
Intel & Imperial College London
</div>
</summary>
<div class="abstract">
RTL development is hampered by low design space exploration and long
debug times. High Level Synthesis and ML techniques are not tackling
the nature of optimization RTL teams are doing in complex Graphics type
IP. This talk provides initial results on: automatically transforming
RTL , exploring the design space, how Intel GFx knowhow is exploited,
associated formal verification and how the foundation of the system is
using the latest e-graph technology.
</div>
</details>
<details>
<summary>
<time>
Apr 7, 2022
</time>
<div class="title">
Choosing function implementations for speed and accuracy
</div>
<div class="speaker">
Ian Briggs,
Computer Science,
University of Utah
</div>
</summary>
<div class="abstract">
Standard implementations of functions like sin and exp optimize for
accuracy, not speed, because they are intended for general-purpose use.
But just like many applications tolerate inaccuracy from cancellation,
rounding error, and singularities, many applications could also
tolerate less-accurate function implementations. This raises an
intriguing possibility: speeding up numerical code by using different
function implementations.
Enter OpTuner, an automated tool for selecting the best implementation
for each mathematical function call site. OpTuner uses error Taylor
series and integer linear programming to compute optimal assignments of
297 function implementations to call sites and presents the user with a
speed-accuracy Pareto curve. In a case study on the POV-Ray ray tracer,
OpTuner speeds up a critical computation by 2.48×, leading to a whole
program speedup of 1.09× with no change in the program output; human
efforts result in slower code and lower-quality output. On a broader
study of 36 standard benchmarks, OpTuner demonstrates speedups of 2.05×
for negligible decreases in accuracy and up to 5.37× for error-tolerant
applications.
</div>
</details>
<details>
<summary>
<time>
Mar 3, 2022
</time>
<div class="title">
CORE-MATH correctly rounded mathematical functions
</div>
<div class="speaker">
Paul Zimmerman, INRIA
</div>
</summary>
<div class="abstract">
The mission of the CORE-MATH project is to provide off-the-shelf
open-source mathematical functions with correct rounding that can be
integrated into current mathematical libraries (GNU libc, Intel Math
Library, AMD Libm, Newlib, OpenLibm, Musl, Apple Libm, llvm-libc, CUDA
libm, ROCm). This presentation covers first results in
single-precision (binary32).
<a href='https://members.loria.fr/PZimmermann/talks/'>Slides</a>
</div>
</details>
<details>
<summary>
<time>
Feb 3, 2022
</time>
<div class="title">
Formal verification of numerical Hamiltonian systems
</div>
<div class="speaker">
Ariel Kellison, Cornell University
</div>
</summary>
</details>