DATAWorks 2023 Agenda Day 3
Audience Legend
1: Everyone. These talks should be accessible to everyone regardless of technical background.
2: Practitioners. These talks might include case studies with some discussion of methods and coding, but remain largely accessible to a non-technical audience.
3: Technical Experts. These talks will likely delve into technical details of methods and analytical computations and are primarily aimed at practitioners advancing the state of the art.
7:30 AM – 8:30 AM | Check-in
8:30 AM – 8:40 AM | Room A+B | Virtual Session
Opening Remarks | Bram Lillard (IDA)
V. Bram Lillard assumed the role of Director of the Operational Evaluation Division (OED) in early 2022. In this position, Bram provides strategic leadership, project oversight, and direction for the division’s research program, which primarily supports the Director, Operational Test and Evaluation (DOT&E) within the Office of the Secretary of Defense. He also oversees OED’s contributions to strategic studies, weapon system sustainment analyses, and cybersecurity evaluations for DOD and anti-terrorism technology evaluations for the Department of Homeland Security. Bram joined IDA in 2004 as a member of the research staff. In 2013-14, he was the acting science advisor to DOT&E. He then served as OED’s assistant director in 2014-21, ascending to deputy director in late 2021. Prior to his current position, Bram was embedded in the Pentagon where he led IDA’s analytical support to the Cost Assessment and Program Evaluation office within the Office of the Secretary of Defense. He previously led OED’s Naval Warfare Group in support of DOT&E. In his early years at IDA, Bram was the submarine warfare project lead for DOT&E programs. He is an expert in quantitative data analysis methods, test design, naval warfare systems and operations, and sustainment analyses for Defense Department weapon systems. Bram has both a doctorate and a master’s degree in physics from the University of Maryland. He earned his bachelor’s degree in physics and mathematics from the State University of New York at Geneseo. Bram is also a graduate of the Harvard Kennedy School’s Senior Executives in National and International Security program, and he was awarded IDA’s prestigious Goodpaster Award for Excellence in Research in 2017.
8:40 AM – 9:20 AM | Room A+B | Virtual Session
Keynote 3 | The Honorable Christine Fox (Johns Hopkins University Applied Physics Laboratory)
The Honorable Christine Fox currently serves as a member of the President’s National Infrastructure Advisory Council, participates on many governance and advisory boards, and is a Senior Fellow at the Johns Hopkins University Applied Physics Laboratory. Previously, she was the Assistant Director for Policy and Analysis at JHU/APL, a position she held from 2014 to early 2022. Before joining APL, she served as Acting Deputy Secretary of Defense from 2013 to 2014 and as Director of Cost Assessment and Program Evaluation (CAPE) from 2009 to 2013. As Director, CAPE, Ms. Fox served as chief analyst to the Secretary of Defense. She officially retired from the Pentagon in May 2014. Prior to her DoD positions, she served as president of the Center for Naval Analyses from 2005 to 2009, after working there as a research analyst and manager since 1981. Ms. Fox holds Bachelor of Science and Master of Science degrees from George Mason University.
9:30 AM – 10:30 AM | Room A+B
Featured Panel: AI Assurance | Session Chair: Chad Bieber
This panel discussion will bring together an international group of AI Assurance Case experts from Academia and Government labs to discuss the challenges and opportunities of applying assurance cases to AI-enabled systems. The panel will discuss how assurance cases apply to AI-enabled systems, pitfalls in developing assurance cases, including human-system integration into the assurance case, and communicating the results of an assurance case to non-technical audiences. This panel discussion will be of interest to anyone who is involved in the development or use of AI systems. It will provide insights into the challenges and opportunities of using assurance cases to provide justified confidence to all stakeholders, from the AI users and operators to executives and acquisition decision-makers.
Laura Freeman (Virginia Tech National Security Institute) Dr. Laura Freeman is a Research Associate Professor of Statistics and the Deputy Director of the Virginia Tech National Security Institute. Her research leverages experimental methods to bring together cyber-physical systems, data science, artificial intelligence (AI), and machine learning to address critical challenges in national security. She is also a hub faculty member in the Commonwealth Cyber Initiative and leads research in AI Assurance. She develops new methods for test and evaluation focusing on emerging system technology. She is also the Assistant Dean for Research for the College of Science; in that capacity she works to shape research directions and collaborations across the College of Science. Previously, Dr. Freeman was the Assistant Director of the Operational Evaluation Division at the Institute for Defense Analyses. In that position, she established and developed an interdisciplinary analytical team of statisticians, psychologists, and engineers to advance scientific approaches to DoD test and evaluation. During 2018, Dr. Freeman served as the acting Senior Technical Advisor for the Director, Operational Test and Evaluation (DOT&E). As the Senior Technical Advisor, Dr. Freeman provided leadership, advice, and counsel to all personnel on technical aspects of testing military systems. She reviewed test strategies, plans, and reports for all systems under DOT&E oversight. Dr. Freeman has a B.S. in Aerospace Engineering, an M.S. in Statistics, and a Ph.D. in Statistics, all from Virginia Tech. Her Ph.D. research was on design and analysis of experiments for reliability data.
Alec Banks (Defence Science and Technology Laboratory) Dr Alec Banks works as a Senior Principal Scientist at the Defence Science and Technology Laboratory (Dstl). He has worked in defence engineering for over 30 years. His recent work has focused on the safety of software systems, including working as a software regulator for the UK’s air platforms and developing software safety assurance for the UK’s next-generation submarines. In his regulatory capacity he has provided software certification assurance for platforms such as Scan Eagle, Watchkeeper and F-35 Lightning II. More recently, Alec has been the MOD lead on research in Test, Evaluation, Verification and Validation of Autonomy and AI-based systems; this has included revisions of the UK’s Defence Standard for software safety to facilitate greater use of models and simulations and the adoption of machine learning in higher-integrity applications.
Josh Poore (Applied Research Laboratory for Intelligence and Security, University of Maryland) Dr. Joshua Poore is an Associate Research Scientist with the Applied Research Laboratory for Intelligence and Security (ARLIS) at the University of Maryland where he supports the Artificial Intelligence, Autonomy, and Augmentation (AAA) mission area. Previously, Dr. Poore was a Principal Senior Scientist and Technology Development Manager at BAE Systems, FAST Labs (6/2018-2/2021), and Principal Scientist at Draper (03/2011-06/2018). Across his industry career, Dr. Poore was Technical Director or Principal Investigator for a wide range of projects focused on human-system integration and human-AI interaction funded by DARPA, IARPA, AFRL, and other agencies. Dr. Poore’s primary research foci are: (1) the use of ubiquitous software and distributed information technology as measurement mediums for human performance; (2) knowledge management or information architectures for reasoning about human system integration within and across systems. Currently, this work serves ARLIS’ AAA test-bed for Test and Evaluation activities. Dr. Poore also leads the development of open source technology aligned with this research as a committer and Project Management Committee member for the Apache Software Foundation.
John Stogoski (Software Engineering Institute, Carnegie Mellon University) John Stogoski has been at Carnegie Mellon University’s Software Engineering Institute for 10 years, including roles in the CERT and AI Divisions. He is currently a senior systems engineer working with DoD sponsors to research how artificial intelligence can be applied to increase capabilities and build the AI engineering discipline. In his previous role, he oversaw a prototyping lab focused on evaluating emerging technologies and design patterns for addressing cybersecurity operations at scale. John spent a significant portion of his career at a major telecommunications company, where he served in director roles responsible for the security operations center and then establishing a homeland security office after the 9/11 attacks. He worked with government and industry counterparts to advance policy and enhance our coordinated, operational capabilities to lessen the impacts of future attacks or natural disaster events. Applying lessons from the maturing of the security field, along with considering the unique aspects of artificial intelligence, can help us enhance the system development lifecycle and realize the opportunities that increase our strategic advantage.
10:30 AM – 10:50 AM | Break
10:50 AM – 12:20 PM: Parallel Sessions
Room A | Virtual Session
Session 4A: Transforming T&E to Assess Modern DoD Systems | Session Chair: Kristen Alexander, DOT&E
DOT&E Strategic Initiatives, Policy, and Emerging Technologies (SIPET) Mission Brief
Jeremy Werner (DOT&E)
Jeremy Werner, PhD, ST was appointed DOT&E’s Chief Scientist in December 2021 after initially starting at DOT&E as an Action Officer for Naval Warfare in August 2021. Before then, Jeremy was at Johns Hopkins University Applied Physics Laboratory (JHU/APL), where he founded a data science-oriented military operations research team that transformed the analytics of an ongoing military mission. Jeremy previously served as a Research Staff Member at the Institute for Defense Analyses, where he supported DOT&E in the rigorous assessment of a variety of systems/platforms. Jeremy received a PhD in physics from Princeton University, where he was an integral contributor to the Compact Muon Solenoid collaboration in the experimental discovery of the Higgs boson at the Large Hadron Collider at CERN, the European Organization for Nuclear Research in Geneva, Switzerland. Jeremy is a native Californian and received a bachelor’s degree in physics from the University of California, Los Angeles, where he was the recipient of the E. Lee Kinsey Prize (most outstanding graduating senior in physics). SIPET, established in 2021, is a deputate within the office of the Director, Operational Test and Evaluation (DOT&E). DOT&E created SIPET to codify and implement the Director’s strategic vision and keep pace with science and technology to modernize T&E tools, processes, infrastructure, and workforce. The mission of SIPET is to drive continuous innovation to meet the T&E demands of the future; support the development of a workforce prepared to meet the toughest T&E challenges; nurture a culture of information exchange across the enterprise; and update policy and guidance. SIPET proactively identifies current and future operational test and evaluation needs, gaps, and potential solutions in coordination with the Services and agency operational test organizations. Collaborating with numerous stakeholders, SIPET develops and refines operational test policy guidance that supports new test methodologies and technologies in the acquisition and test communities. SIPET, in collaboration with the T&E community, is leading the development of the 2022 DOT&E Strategy Update Implementation Plan (I-Plan). I-Plan initiatives include:
T&E as a Continuum
Orlando Flores (OUSD(R&E))
Mr. Orlando Flores is currently the Chief Engineer within the Office of the Executive Director for Developmental Test, Evaluation, and Assessments (DTE&A). He serves as the principal technical advisor to the Executive Director, Deputies, Cybersecurity Technical Director, Systems Engineering and Technical Assistance (SETA) staff, and outside Federally Funded Research and Development Center (FFRDC) technical support for all DT&E and Systems Engineering (SE) matters. Prior to this, he served as the Technical Director for Surface Ship Weapons within the Program Executive Office Integrated Warfare Systems (PEO IWS), where he was responsible for providing technical guidance and direction for the Surface Ship Weapons portfolio of surface missiles, launchers and gun weapon systems. From June 2016 to June 2019 he served as the Director for Surface Warfare Weapons for the Deputy Assistant Secretary of the Navy for Ship Programs (DASN Ships), supporting the Assistant Secretary of the Navy for Research, Development and Acquisition. He was responsible for monitoring and advising DASN Ships on all matters related to surface weapons and associated targets. Prior to this, from August 2009 through June 2016, Mr. Flores was the Deputy Project Manager and Lead Systems Engineer for the Standard Missile-6 Block IA (SM-6 Blk IA) missile program within PEO IWS. In this role he oversaw the management, requirements definition, design, development and fielding of one of the Navy’s newest surface warfare weapons. Beginning in 2002 and through 2009, Mr. Flores served in multiple capacities within the Missile Defense Agency (MDA). His responsibilities included functional lead for test and evaluation of the Multiple Kill Vehicle program; Modeling and Simulation development lead; Command and Control, Battle Management, and Communications systems engineer for the Kinetic Energy Interceptors program; and Legislative Liaison for all U.S. House Appropriations Committee matters. In 2008 he was selected to serve as a foreign affairs analyst for the Deputy Assistant Secretary of Defense for Nuclear and Missile Defense Policy within the Office of the Under Secretary of Defense for Policy, where he developed and oversaw policies, strategies, and concepts pertaining to U.S. Ballistic Missile Defense System operations and deployment across the globe. Mr. Flores began his federal career in 1998 as an engineering intern for the Naval Sea Systems Command (NAVSEA) within the Department of the Navy. In July 2000 Mr. Flores graduated from the engineering intern program and assumed the position of Battle Management, Command Control, and Communications (BMC3) systems engineer for the NTW program, where he led the design and development of ship-based ballistic missile defense BMC3 systems through 2002. Mr. Flores graduated from New Mexico State University in 1998, where he earned a Bachelor of Science degree in Mechanical Engineering. He earned a Master of Business Administration in 2003. Mr. Flores is a member of the Department of Defense Acquisition Professional Community and has achieved two Defense Acquisition Workforce Improvement Act certifications: Level III in Program Management and Level II in Systems Planning, Research, Development and Engineering. A critical change in how Test and Evaluation (T&E) supports capability delivery is needed to maintain our advantage over potential adversaries.
Making this change requires a new paradigm in which T&E provides focused and relevant information supporting decision-making continually throughout capability development and informs decision makers from the earliest stages of Mission Engineering (ME) through Operations and Sustainment (O&S). This new approach improves the quality of T&E by moving from a serial set of activities conducted largely independently of Systems Engineering (SE) and ME activities to a new integrative framework focused on a continuum of activities termed T&E as a Continuum. T&E as a Continuum has three key attributes – capability and outcome focused testing; an agile, scalable evaluation framework; and enhanced test design – critical in the conduct of T&E and improving capability delivery. T&E as a Continuum builds off the 2018 DoD Digital Engineering (DE) Strategy’s five critical goals with three key enablers – robust live, virtual, and constructive (LVC) testing; developing model-based environments; and a “digital” workforce knowledgeable of the processes and tools associated with MBSE, model-based T&E, and other model-based processes. T&E as a Continuum improves the quality of T&E through integration of traditional SE and T&E activities, providing a transdisciplinary, continuous process coupling design evolution with VV&A.
NASEM Range Capabilities Study and T&E of Multi-Domain Operations
Hans Miller (MITRE)
Hans Miller, Col USAF (ret), is a Chief Engineer for the Research and Advanced Capabilities department at the MITRE Corporation. He retired with over 25 years of experience in combat operations, experimental flight test, international partnering, command and control, policy, and strategic planning of defense weapon systems. His last assignment was as Division Chief of the Policy, Programs and Resources Division, Headquarters Air Force Test and Evaluation Directorate at the Pentagon. He led a team responsible for Test and Evaluation policy throughout the Air Force, coordination with OSD and Joint Service counterparts, and staff oversight across the spectrum of all Air Force acquisition programs. Prior to that assignment, he was the Commander of the 96th Test Group, Holloman AFB, NM. The 96th Test Group conducted avionics and weapon systems flight tests, inertial navigation and Global Positioning System tests, high-speed test track operations and radar cross section tests necessary to keep joint weapon systems ready for war. Hans Miller was commissioned as a graduate of the USAF Academy. He has served as an operational and experimental flight test pilot in the B-1B and as an F-16 chase pilot. He flew combat missions in the B-1B in Operation Allied Force and Operation Enduring Freedom. He served as an Exercise Planning Officer at the NATO Joint Warfare Center, Stavanger, Norway. Col (ret) Miller was the Squadron Commander of the Global Power Bomber Combined Test Force, coordinating ground and flight test activities on the B-1, B-2 and B-52. He served as the Director, Comparative Technology Office, within the Office of the Secretary of Defense. He managed the Department’s Foreign Comparative Testing and Rapid Innovation Fund programs. Hans Miller is a Command Pilot with over 2100 hours in 35 different aircraft types. He is a Department of Defense Acquisition Corps member and holds Level 3 certification in Test and Evaluation. He is a graduate of the USAF Weapons School, USAF Test Pilot School, Air Command and Staff College and Air War College. He holds a bachelor’s degree in Aeronautical Engineering and a master’s degree in Aeronautical and Astronautical Engineering from Stanford University. The future viability of DoD’s range enterprise depends on addressing dramatic changes in technology, rapid advances in adversary military capabilities, and the evolving approach the United States will take to closing kill chains in a Joint All Domain Operations environment. This recognition led DoD’s former Director of Operational Test and Evaluation (OT&E), the Honorable Robert Behler, to request that the National Academies of Sciences, Engineering, and Medicine examine the physical and technical suitability of DoD’s ranges and infrastructure through 2035. The first half of this presentation will cover the highlights and key recommendations of this study, to include the need to create the “TestDevOps” digital infrastructure for future operational test and seamless range enterprise interoperability. The second half of this presentation looks at the legacy frameworks for the relationships of physical and virtual test capabilities, and how those frameworks are becoming outdated. This briefing explores proposals on how the interaction of operations, physical test capabilities, and virtual test capabilities needs to evolve to support new paradigms of the rapidly evolving technologies and changing nature of multi-domain operations.
Room B | Virtual Session
Session 4B: Artificial Intelligence Methods & Current Initiatives | Session Chair: Brian Vickers, IDA
Gaps in DoD National Artificial Intelligence Test and Evaluation Infrastructure Capabilities
Rachel Haga (IDA) Significant literature has been published in recent years calling for updated and new T&E infrastructure to allow for the credible, verifiable assessment of DoD AI-enabled capabilities (AIECs). However, existing literature falls short in providing the detail necessary to justify investments in specific DoD Enterprise T&E infrastructure. The goal of this study was to collect data about current DoD programs with AIECs and their corresponding T&E infrastructure to identify high-priority investments by tracing AIECs to tailored, specific recommendations for enterprise AI T&E infrastructure. The study is divided into six bins of research. This presentation provides an interim study update on the current state of programs with AIECs across DoD.
Assurance of Responsible AI/ML in the DOD Personnel Space
John Dennis (IDA)
Dr. John W. Dennis, PhD, is a research staff member focusing on Econometrics, Statistics, and Data Science in the Institute for Defense Analyses’ Human Capital and Test Science groups. He received his PhD in Economics from the University of North Carolina at Chapel Hill in 2019. Testing and assuring responsible use of AI/ML enabled capabilities is a nascent topic in the DOD with many efforts being spearheaded by CDAO. In general, black box models tend to suffer from consequences related to edge cases, emergent behavior, misplaced or lack of trust, and many other issues, so traditional testing is insufficient to guarantee safety and responsibility in the employment of a given AI enabled capability. Focus of this concern tends to fall on well-publicized and high-risk capabilities, such as AI enabled autonomous weapons systems. However, while AI/ML enabled capabilities supporting personnel processes and systems, such as algorithms used for retention and promotion decision support, tend to carry low safety risk, many concerns, some of them specific to the personnel space, run the risk of undermining the DOD’s 5 ethical principles for RAI. Examples include service member privacy concerns, invalid prospective policy analysis, disparate impact against marginalized service member groups, and unintended emergent service member behavior in response to use of the capability. Eroding barriers to use of AI/ML are facilitating an increasing number of applications while some of these concerns are still not well understood by the analytical community. We consider many of these issues in the context of an IDA ML enabled capability and propose mechanisms to assure stakeholders of the adherence to the DOD’s ethical principles.
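The disparate-impact concern raised above can be made concrete with a small check of selection rates by group. The sketch below is illustrative only, using synthetic data, an invented score cutoff, and the common "four-fifths" rule of thumb; it is not the IDA capability or mechanism discussed in the talk.

```python
# Minimal sketch (illustrative, not the authors' method): check a decision-support
# model's outputs for disparate impact across two hypothetical groups using the
# selection-rate ratio ("four-fifths" rule). All data and thresholds are synthetic.
import numpy as np

rng = np.random.default_rng(0)

scores = rng.uniform(0, 1, size=1000)        # hypothetical model-predicted suitability
group = rng.choice(["A", "B"], size=1000)    # proxy for a protected attribute
selected = scores > 0.7                      # hypothetical recommendation cutoff

rates = {g: selected[group == g].mean() for g in ("A", "B")}
ratio = min(rates.values()) / max(rates.values())

print(f"selection rates by group: {rates}")
print(f"disparate impact ratio:   {ratio:.2f}  (values below ~0.8 often flag concern)")
```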
CDAO Joint AI Test Infrastructure Capability
David Jin (MITRE)
David Jin is the AI Test Tools Lead at the Chief Digital and AI Office. Within this role, he leads the Joint AI Test Infrastructure Capability Program, which is developing software tools for rigorous AI algorithmic testing. His background is in computer vision and pure mathematics. The Chief Digital and AI Office (CDAO) Test & Evaluation Directorate is developing the Joint AI Test Infrastructure Capability (JATIC) program of record, which is an interoperable set of state-of-the-art software capabilities for AI Test & Evaluation. It aims to provide a comprehensive suite of integrated testing tools that can be deployed widely across the enterprise to address key T&E gaps. In particular, JATIC capabilities will support the assessment of AI system performance, cybersecurity, adversarial resilience, and explainability – enabling the end-user to more effectively execute their mission. It is a key component of the digital testing infrastructure that the CDAO will provide in order to support the development and deployment of data, analytics, and AI across the Department.
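As a loose illustration of the kind of adversarial-resilience check such a test suite might automate (not the JATIC tools themselves), the sketch below measures how a simple classifier's accuracy degrades under a fast-gradient-sign perturbation; the model, data, and perturbation size are all invented for the example.

```python
# Illustrative sketch: accuracy of a logistic-regression classifier on clean inputs
# versus inputs perturbed in the direction of the loss gradient (FGSM-style attack).
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 10
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = (X @ w_true + 0.5 * rng.normal(size=n) > 0).astype(float)

# Fit logistic regression with plain gradient descent.
w, b = np.zeros(d), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.1 * X.T @ (p - y) / n
    b -= 0.1 * np.mean(p - y)

def accuracy(Xe):
    return np.mean(((Xe @ w + b) > 0).astype(float) == y)

# FGSM-style perturbation: step each input in the sign of d(loss)/dx = (p - y) * w.
p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
grad_x = np.outer(p - y, w)
X_adv = X + 0.5 * np.sign(grad_x)      # epsilon = 0.5, chosen only for illustration

print(f"clean accuracy:       {accuracy(X):.3f}")
print(f"adversarial accuracy: {accuracy(X_adv):.3f}")
```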
Room C
Session 4C: Applications of Machine Learning and Uncertainty Quantification | Session Chair: Jonathan Rathsam, NASA Langley Research Center
Advanced Automated Machine Learning System for Cybersecurity
Himanshu Dayaram Upadhyay (Florida International University)
Dr. Himanshu Upadhyay has served Florida International University’s Applied Research Center for the past 21 years, leading the Artificial Intelligence / Cybersecurity / Big Data research group. He is currently working as an Associate Professor in Electrical & Computer Engineering, teaching Artificial Intelligence and Cybersecurity courses. He is mentoring DOE Fellows, AI Fellows, Cyber Fellows, and undergraduate and graduate students supporting multiple cybersecurity & AI research projects from various federal agencies. He brings more than 30 years of experience in artificial intelligence/machine learning, big data, cybersecurity, information technology, management and engineering to his role, serving as co-principal investigator for multimillion-dollar cybersecurity and artificial intelligence projects for the Department of Defense and Defense Intelligence Agency. He is also serving as co-principal investigator for the Department of Energy’s Office of Environmental Management research projects focused on knowledge/waste management, cybersecurity, artificial intelligence and big data technologies. He has published multiple papers in the areas of cybersecurity, machine learning, deep learning, big data, knowledge / nuclear waste management and service-oriented architecture. His current research focuses on artificial intelligence, machine learning, deep learning, cyber security, big data, cyber analytics/visualization, cyber forensics, malware analysis and blockchain. He has architected a range of tiered and distributed application systems to address strategic business needs, managing a team of researchers and scientists building secured enterprise information systems. Florida International University (FIU) has developed an Advanced Automated Machine Learning System (AAMLS) under sponsored research from the Department of Defense – Test Resource Management Center (DOD-TRMC) to provide Artificial Intelligence based advanced analytics solutions in areas such as cyber, IoT, network, energy, and environment. AAMLS is a Rapid Modeling & Testing Tool (RMTT) for developing machine learning and deep learning models in a few steps by subject matter experts from various domains with minimal machine learning knowledge, using auto & optimization workflows. AAMLS allows analysis of data collected from different test technology domains by using machine learning / deep learning and ensemble learning approaches to generate models, make predictions, then apply advanced analytics and visualization to perform analysis. This system enables automated machine learning using the AI based Advanced Analytics and Analytics Control Center platforms by connecting to multiple data sources. Artificial Intelligence based Advanced Analytics Platform: This platform is the analytics engine of AAMLS, which provides pre-processing, feature engineering, model building and predictions. Primary components of this platform include:
Uncertainty Aware Machine Learning for Accelerators
Malachi Schram (Thomas Jefferson National Accelerator Facility)
Dr. Malachi Schram is the head of the data science department at the Thomas Jefferson National Accelerator Facility. His research spans large-scale distributed computing, applications of data science, and the development of new techniques and algorithms in data science. His current research is focused on uncertainty quantification for deep learning and new techniques for design and control. Standard deep learning models for classification and regression applications are ideal for capturing complex system dynamics. Unfortunately, their predictions can be arbitrarily inaccurate when the input samples are not similar to the training data. Implementation of distance-aware uncertainty estimation can be used to detect these scenarios and provide a level of confidence associated with their predictions. We present results using Deep Gaussian Process Approximation (DGPA) methods for 1) anomaly detection at the Spallation Neutron Source (SNS) accelerator and 2) an uncertainty-aware surrogate model for the Fermi National Accelerator Lab (FNAL) Booster Accelerator Complex.
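A minimal sketch of the distance-aware idea, using an off-the-shelf Gaussian process from scikit-learn rather than the DGPA method described in the talk: predictive standard deviation grows for inputs far from the training data, which can flag out-of-distribution samples. The data, kernel, and flagging threshold are illustrative assumptions.

```python
# Sketch: distance-aware uncertainty with a Gaussian process regressor.
# Inputs far from the training data receive large predictive standard deviations.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(2)
X_train = rng.uniform(-2, 2, size=(40, 1))
y_train = np.sin(2 * X_train).ravel() + 0.1 * rng.normal(size=40)

gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(X_train, y_train)

X_test = np.array([[0.0], [1.5], [6.0]])   # last point lies far outside the training range
mean, std = gp.predict(X_test, return_std=True)
for x, m, s in zip(X_test.ravel(), mean, std):
    flag = "  <-- low confidence / possible out-of-distribution" if s > 0.5 else ""
    print(f"x = {x:4.1f}: prediction = {m:6.2f}, std = {s:5.2f}{flag}")
```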
Well-Calibrated Uncertainty Quantification for Language Models in the Nuclear Domain
Karl Pazdernik (Pacific Northwest National Laboratory)
Dr. Karl Pazdernik is a Senior Data Scientist within the National Security Directorate at Pacific Northwest National Laboratory (PNNL), a team lead within the Foundational Data Science group at PNNL, and a Research Assistant Professor at North Carolina State University. He is the program lead for the Open-Source Data Analytics program and a principal investigator on projects that involve disease modeling and image segmentation for materials science. His research has focused on the uncertainty quantification and dynamic modeling of multi-modal data with a particular interest in text analytics, spatial statistics, pattern recognition, anomaly detection, Bayesian statistics, and computer vision applied to financial data, networks, combined open-source data, disease prediction, and nuclear materials. He received a B.A. in Mathematics from Saint John’s University, a Ph.D. in Statistics from Iowa State University, and was a postdoctoral scholar at North Carolina State University under the Consortium for Nonproliferation Enabling Capabilities. A key concern for global and national security in the nuclear weapons age is the proliferation of nuclear weapons technology. A key component of enforcing non-proliferation policy is developing an awareness of the scientific research being pursued by other nations and organizations. To support non-proliferation goals and contribute to nuclear science research, we trained a RoBERTa deep neural language model on a large set of U.S. Department of Energy Office of Scientific and Technical Information (OSTI) research article abstracts and then fine-tuned this model for classification of scientific abstracts into 60 disciplines, which we call NukeLM. This multi-step approach to training improved classification accuracy over its untrained or partially out-of-domain competitors. While it is important for classifiers to be accurate, there has also been growing interest in ensuring that classifiers are well-calibrated, with uncertainty quantification that is understandable to human decision-makers. For example, in the multiclass problem, classes with a similar predicted probability should be semantically related. Therefore, we also introduced an extension of the Bayesian belief matching framework proposed by Joo et al. (2020) that easily scales to large NLP models, such as NukeLM, and better achieves the desired uncertainty quantification properties.
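One common way to quantify how well-calibrated a classifier's probabilities are is the expected calibration error (ECE). The sketch below is a generic illustration on synthetic data, not the NukeLM code or the Bayesian belief matching extension described above.

```python
# Sketch: expected calibration error (ECE) from predicted probabilities and outcomes.
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """probs: predicted probability of the chosen class; labels: 1 if correct else 0."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (probs > lo) & (probs <= hi)
        if mask.any():
            gap = abs(probs[mask].mean() - labels[mask].mean())
            ece += mask.mean() * gap   # weight the bin by its share of the samples
    return ece

# Toy example: a slightly overconfident classifier.
rng = np.random.default_rng(3)
confidence = rng.uniform(0.5, 1.0, size=2000)
correct = (rng.uniform(size=2000) < confidence * 0.9).astype(float)
print(f"ECE = {expected_calibration_error(confidence, correct):.3f}")
```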
Room D
Spotlight on Data Literacy | Session Chair: Matthew Avery, IDA
Data Literacy Within the Department of Defense
Nicholas Clark (United States Military Academy)
COL Nicholas Clark is an Associate Professor in the Department of Mathematical Sciences at West Point, where he is the Program Director for West Point’s Applied Statistics and Data Science Program. Nick received a BS in Mathematics from West Point in 2002, an MS in Statistics from George Mason in 2010, and a PhD in Statistics from Iowa State University in 2018. His dissertation was on Self-Exciting Spatio-Temporal Statistical Models and he has published in a variety of disciplines including spatio-temporal statistics, best practices in statistical methodologies, epidemiology, and sports statistics. Nick is the former director of the Center for Data Analysis and Statistics, where he conducted research for a variety of Department of Defense clients. COL Clark served as the Chief Data Scientist for JSOC while on sabbatical from June 2021 – June 2022. While in this role he created the Army’s Data Literacy 101 course, teaching the fundamentals of data literacy to Army soldiers, civilians and contractors. Since inception, he and his team have now delivered the course over 30 times to a wide range of Army organizations. Data literacy, the ability to read, write, and communicate data in context, is fundamental for military organizations to create a culture where data is appropriately used to inform both operational and non-operational decisions. However, oftentimes organizations outsource data problems to outside entities and rely on a small cadre of data experts to tackle organizational problems. In this talk we will argue that data literacy is not solely the role or responsibility of the data expert. Ultimately, if experts develop tools and analytics that Army decision makers cannot use, or do not effectively understand the way the Army makes decisions, the Army is no more data rich than if it had no data at all. While serving on a sabbatical as the Chief Data Scientist for Joint Special Operations Command, COL Nick Clark (Department of Mathematical Sciences, West Point) noticed that a lack of basic data literacy skills was a major limitation to creating a data-centric organization. As a result, he created 10 hours of training focusing on the fundamentals of data literacy. After delivering the course to JSOC, other DoD organizations began requesting the training. In response, a team from West Point joined with the Army Talent Management Task Force to create mobile training teams. The teams have now delivered the training over 30 times to organizations ranging from tactical units up to strategic-level commands. In this talk, we discuss what data literacy skills should be taught to the force and highlight best practices in educating soldiers, civilians, and contractors on the basics of data literacy. We will finally discuss strategies for assessing organizational data literacy and provide a framework for attendees to assess their own organization’s data strengths and weaknesses.
Room E | Virtual Session
Mini-Tutorial 4 | Session Chair: Gina Sigler, STAT COE
An Overview of Methods, Tools, and Test Capabilities for T&E of Autonomous Systems
Leonard Truett and Charlie Middleton (STAT COE)
Dr. Truett has been a member of the Scientific Test and Analysis Techniques Center of Excellence (STAT COE) located at WPAFB, OH since 2012 and is currently the Senior STAT Expert. He began his career as a civilian for the 46th Test Wing supporting Live Fire Test and Evaluation (LFT&E), specializing in fire and explosion suppression for aircraft. He has also worked for the Institute for Defense Analyses (IDA) supporting the Director, Operational Test and Evaluation (DOT&E) in LFT&E and Operational Test and Evaluation (OT&E) for air systems. He holds a Bachelor of Science and a Master of Science in Aerospace Engineering from the Georgia Institute of Technology, and a Doctorate in Aerospace Engineering from the University of California, San Diego. Charlie Middleton currently leads the Advancements in Test and Evaluation (T&E) of Autonomous Systems team for the OSD STAT Center of Excellence. His responsibilities include researching autonomous system T&E methods and tools; collaborating with Department of Defense program and project offices developing autonomous systems; leading working groups of autonomy testers, staffers, and researchers; and authoring a handbook, reports, and papers related to T&E of autonomous systems. Previously, Mr. Middleton led development of a live-fire T&E risk-based framework for survivability and lethality evaluation for the office of the Director, Operational T&E; led a multi-domain modeling and simulation team supporting Air Force Research Labs future space efforts; and developed a Bayesian reliability analysis toolset for the National Air and Space Intelligence Center. While an active-duty Air Force officer, he was a developmental and operational test pilot leading several aircraft and weapons T&E programs and projects, and piloted 291 combat hours in the F-16 aircraft, employing precision munitions in Close Air Support, Time-Sensitive Targeting, and Suppression of Enemy Air Defense combat missions. Mr. Middleton is a distinguished graduate of the U.S. Naval Test Pilot School, and holds undergraduate and graduate degrees in operations research and operations analysis from Princeton University and the Air Force Institute of Technology. This tutorial will give an overview of selected methodologies, tools and test capabilities discussed in the draft “Test and Evaluation Companion Guide for Autonomous Military Systems.” This test and evaluation (T&E) companion guide is being developed to provide test and evaluation practitioners, to include program managers, test planners, test engineers, and analysts, with test strategies, applicable methodologies, and tools that will help to improve rigor in addressing the challenges unique to the T&E of autonomy. It will also cover selected capabilities of test laboratories and ranges that support autonomous systems. The companion guide is intended to be a living document contributed to by the entire community and will adapt to ensure the right information reaches the right audience.
12:20 PM – 1:30 PM | Lunch
1:30 PM – 3:00 PM: Parallel Sessions
Room A | Virtual Session
Session 5A: Methods and Tools at National Labs | Session Chair: Karl Pazdernik, Pacific Northwest National Laboratory
Test and Evaluation Methods for Authorship Attribution and Privacy Preservation
Emily Saldanha (Pacific Northwest National Laboratory)
Dr. Emily Saldanha is a research scientist in the Data Science and Analytics group of the National Security Directorate at Pacific Northwest National Laboratory. Her work focuses on developing machine learning, deep learning, and natural language processing methods for diverse applications with the aim to extract information and patterns from complex and multimodal datasets with weak and noisy signals. Her research efforts have spanned application areas ranging from energy technologies to computational social science. She received her Ph.D. in physics from Princeton University in 2016, where her work focused on the development and application of calibration algorithms for microwave sensors for cosmological observations. The aim of the IARPA HIATUS program is to develop explainable systems for authorship attribution and author privacy preservation through the development of feature spaces which encode the distinguishing stylistic characteristics of authors independently of text genre, topic, or format. In this talk, I will discuss progress towards defining an evaluation framework for this task to provide robust insights into system strengths, weaknesses, and overall performance. Our evaluation strategy includes the use of an adversarial framework between attribution and privacy systems, development of a focused set of core metrics, analysis of system performance dependencies on key data factors, systematic exploration of experimental variables to probe targeted questions about system performance, and investigation of key trade-offs between different performance measures.
Tools for Assessing Machine Learning Models’ Performance in Real-World Settings
Carianne Martinez (Sandia National Laboratories)
Cari Martinez is a Principal Computer Scientist in the Applied Machine Intelligence Department at Sandia National Laboratories. She is a technical lead for a team that focuses on applied deep learning research to benefit Sandia’s mission across a diverse set of science and engineering disciplines. Her research focuses on improving deep learning modeling capabilities with domain knowledge, uncertainty quantification, and explainability techniques. Cari’s work has been applied to modeling efforts in several fields such as materials science, engineering science, structural dynamics, chemical engineering, and healthcare. Machine learning (ML) systems demonstrate powerful predictive capability, but fielding such systems does not come without risk. ML can catastrophically fail in some scenarios, and in the absence of formal methods to validate most ML models, we require alternative methods to increase trust. While emerging techniques for uncertainty quantification and model explainability may seem to lie beyond the scope of many ML projects, they are essential tools for understanding deployment risk. This talk will share a practical workflow, useful tools, and lessons learned for ML development best practices. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. SAND2023-11982A
Perspectives on T&E of ML for Assuring Reliability in Safety-Critical Applications
Pradeep Ramuhalli (Oak Ridge National Laboratory)
Dr. Pradeep Ramuhalli is a group lead for the Modern Nuclear Instrumentation and Controls group and a Distinguished R&D Scientist at Oak Ridge National Laboratory (ORNL). He leads a group with experience in measurement and data science applications to a variety of complex engineered systems. His research focus is on the development of sensor technologies for extreme environments and the integration of data from these sensors with data analytics technologies for prognostic health management and operational decision making. He also leads research on ML for enabling robust engineered systems (AIRES – AI for Robust Engineering and Science) as part of an internal research initiative at ORNL. Artificial intelligence (AI) and Machine Learning (ML) are increasingly being examined for their utility in many domains. AI/ML solutions are being proposed for a broad set of applications, including surrogate modeling, anomaly detection, classification, image segmentation, control, etc. Significant effort is being put into evaluating these solutions for robustness, especially in the context of safety-critical applications. While traditional methods of verification and validation continue to be necessary, challenges exist in many safety-critical applications given the limited ability to gather data covering all possible conditions and the limited ability to conduct experiments. This presentation will discuss potential approaches for testing and evaluating machine learning algorithms in such applications, as well as metrics for this purpose.
Room B | Virtual Session
Session 5B: Applications of Bayesian Statistics | Session Chair: Victoria Sieck, STAT COE
A Bayesian Decision Theory Framework for Test & Evaluation
James Ferry (Metron, Inc.)
Dr. James Ferry has been developing Bayesian analytics at Metron for 18 years. He has been the Principal Investigator for a variety of R&D projects that apply Bayesian methods to data fusion, network science, and machine learning. These projects include associating disparate data types from multiple sensors for missile defense, developing methods to track hidden structures on dynamically changing networks, computing incisive analytics efficiently from information in large databases, and countering adversarial attacks on neural-network-based image classifiers. Dr. Ferry was active in the network science community in the 2010s. He organized a full-day special session on network science at FUSION 2015, co-organized WIND 2016 (Workshop on Incomplete Network Data), and organized a multi-day session on the Frontiers of Networks at the MORS METSM meeting in December 2016. Since then, his focus has been Bayesian analytics and network science algorithms for the Intelligence Community. Prior to Metron, Dr. Ferry was a computational fluid dynamicist. He developed models and supercomputer simulations of the multiphase fluid dynamics of rocket engines at the Center for Simulation of Advanced Rockets at UIUC. He has 30+ technical publications in fluid dynamics, network science, and Bayesian analytics. He holds a B.S. in Mathematics from M.I.T. and a Ph.D. in Applied Mathematics from Brown University. Decisions form the core of T&E: decisions about which tests to conduct and, especially, decisions on whether to accept or reject a system at its milestones. The traditional approach to acceptance is based on conducting tests under various conditions to ensure that key performance parameters meet certain thresholds with the required degree of confidence. In this approach, data is collected during testing, then analyzed with techniques from classical statistics in a post-action report. This work explores a new Bayesian paradigm for T&E based on one simple principle: maintaining a model of the probability distribution over system parameters at every point during testing. In particular, the Bayesian approach posits a distribution over parameters prior to any testing. This prior distribution provides (a) the opportunity to incorporate expert scientific knowledge into the inference procedure, and (b) transparency regarding all assumptions being made. Once a prior distribution is specified, it can be updated as tests are conducted to maintain a probability distribution over the system parameters at all times. One can leverage this probability distribution in a variety of ways to produce analytics with no analog in the traditional T&E framework. In particular, having a probability distribution over system parameters at any time during testing enables one to implement an optimal decision-making procedure using Bayesian Decision Theory (BDT). BDT accounts for the cost of various testing options relative to the potential value of the system being tested. When testing is expensive, it provides guidance on whether to conserve resources by ending testing early. It evaluates the potential benefits of testing for both its ability to inform acceptance decisions and for its intrinsic value to the commander of an accepted system. This talk describes the BDT paradigm for T&E and provides examples of how it performs in simple scenarios. In future work we plan to extend the paradigm to include the features, the phenomena, and the SME elicitation protocols necessary to address realistic T&E cases.
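A toy sketch of the core loop described above, assuming a simple Beta-Binomial model of a pass/fail test: the posterior over the system's success probability is updated after each trial and compared against illustrative acceptance values and costs. This is not the speaker's framework, and a full BDT treatment would also weigh the expected value of additional testing before choosing to accept, reject, or continue.

```python
# Sketch: maintain a Beta posterior over a system's success probability and compare
# the expected value of accepting versus rejecting after each test. All costs, the
# requirement threshold, and the prior are illustrative assumptions.
from scipy import stats

alpha, beta = 1.0, 1.0            # uniform prior over the success probability
requirement = 0.80                # system must exceed an 80% success rate
value_good, cost_bad = 100.0, -150.0

results = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1]   # hypothetical pass/fail outcomes
for i, r in enumerate(results, start=1):
    alpha, beta = alpha + r, beta + (1 - r)
    post = stats.beta(alpha, beta)
    p_meets = 1.0 - post.cdf(requirement)           # P(true rate >= requirement | data)
    ev_accept = p_meets * value_good + (1 - p_meets) * cost_bad
    ev_reject = 0.0
    print(f"test {i:2d}: P(meets requirement) = {p_meets:.2f}, "
          f"E[value | accept] = {ev_accept:6.1f}, E[value | reject] = {ev_reject:.1f}")
```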
Saving hardware, labor, and time using Bayesian adaptive design of experiments
Daniel Ries (Sandia National Laboratories)
Daniel Ries is a Senior Member of the Technical Staff at Sandia National Laboratories, where he has been since 2018. His roles at Sandia include statistical test engineer, technical researcher, project manager, and a short stint as acting manager of the Statistical Sciences Department. His work includes developing explainable AI to solve national security problems, applying Bayesian methods to include uncertainty quantification in solutions, and providing test and analysis support to weapon modernization programs. Daniel also serves as an Adjunct Professor at the University of Illinois Urbana-Champaign as an instructor and mentor for statistics majors interested in pursuing a data science career in the national security enterprise. Daniel received his PhD in statistics from Iowa State University in 2017. Physical testing in the national security enterprise is often costly. Sometimes this is driven by hardware and labor costs; other times it can be driven by finite resources of time or hardware builds. Test engineers must make the most of their available resources to answer high-consequence problems. Bayesian adaptive design of experiments (BADE) is one tool that should be in an engineer’s toolbox for designing and running experiments. BADE is a sequential design of experiments approach which allows early stopping decisions to be made in real time using predictive probabilities (PP), allowing for more efficient data collection. BADE has seen successes in clinical trials, another high-consequence arena, and it has resulted in quicker and more effective assessments of drug trials. BADE has been proposed for testing in the national security space for similar reasons of quicker and cheaper test series. Given the high-consequence nature of the tests performed in the national security space, a strong understanding of new methods is required before they are deployed. The main contribution of this research is to assess the robustness of PP in BADE under different modeling assumptions, and to compare PP results to its frequentist alternative, conditional power (CP). Comparisons are made based on Type I error rates, statistical power, and time savings through average stopping time. Simulation results show PP has some robustness to distributional assumptions. PP also tends to control Type I error rates better than CP, while maintaining relatively strong power. While CP usually recommends stopping a test earlier than PP, CP also tends to have more inconsistent results, again showing the benefits of PP in a high-consequence application. An application to a real problem from Sandia National Laboratories shows the large potential cost savings of using PP. The results of this study suggest BADE can be one piece of an evidence package during testing to stop testing early and pivot, in order to decrease costs and increase flexibility. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525.
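For concreteness, a minimal predictive-probability calculation under a Beta-Binomial model is sketched below: given interim results, it computes the probability that the completed test will end with the posterior clearing a success threshold. The prior, thresholds, and sample sizes are illustrative assumptions, not Sandia's implementation.

```python
# Sketch: predictive probability (PP) of final "success" in a pass/fail test, used
# for early-stopping decisions. Success here means the end-of-test posterior
# probability that the true success rate exceeds p0 clears post_threshold.
from scipy import stats

def predictive_probability(successes, n_so_far, n_total, p0=0.9,
                           post_threshold=0.95, prior=(1.0, 1.0)):
    a, b = prior[0] + successes, prior[1] + (n_so_far - successes)
    n_remaining = n_total - n_so_far
    pp = 0.0
    for k in range(n_remaining + 1):
        # Beta-Binomial probability of k successes in the remaining trials.
        prob_k = stats.betabinom.pmf(k, n_remaining, a, b)
        # End-of-test posterior probability that the true rate exceeds p0.
        final_post = 1.0 - stats.beta(a + k, b + n_remaining - k).cdf(p0)
        pp += prob_k * (final_post > post_threshold)
    return pp

print(f"PP = {predictive_probability(successes=18, n_so_far=19, n_total=40):.2f}")
```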
Uncertainty Quantification of High Heat Microbial Reduction for NASA Planetary Protection
Michael DiNicola (Jet Propulsion Laboratory, California Institute of Technology)
Michael DiNicola is a senior systems engineer in the Systems Modeling, Analysis & Architectures Group at the Jet Propulsion Laboratory (JPL). At JPL, Michael has worked on several mission concept developments and flight projects, including Europa Clipper, Europa Lander and Mars Sample Return, developing probabilistic models to evaluate key mission requirements, including those related to planetary protection, and infusing this modeling into trades throughout formulation of the mission concepts. He works closely with microbiologists in the Planetary Protection group to model assay and sterilization methods, and applies mathematical and statistical methods to improve Planetary Protection engineering practices at JPL and across NASA. At the same time, he also works with planetary scientists to characterize the plumes of Enceladus in support of future mission concepts. Michael earned his B.S. in Mathematics from the University of California, Los Angeles and M.A. in Mathematics from the University of California, San Diego. Planetary Protection is the practice of protecting solar system bodies from harmful contamination by Earth life and protecting Earth from possible life forms or bioactive molecules that may be returned from other solar system bodies. Microbiologists and engineers at NASA’s Jet Propulsion Laboratory (JPL) design microbial reduction and sterilization protocols that reduce the number of microorganisms on spacecraft or eliminate them entirely. These protocols are developed using controlled experiments to understand the microbial reduction process. Many times, a phenomenological model (such as a series of differential equations) is posited that captures key behaviors and assumptions of the process being studied. A Sterility Assurance Level (SAL) – the probability that a product, after being exposed to a given sterilization process, contains one or more viable organisms – is a standard metric used to assess risk and define cleanliness requirements in industry and for regulatory agencies. Experiments performed to estimate the SAL of a given microbial reduction or sterilization protocol often have large uncertainties and variability in their results, even under rigorously implemented controls; if not properly quantified, these uncertainties can make it difficult for experimenters to interpret their results and can hamper a credible evaluation of risk by decision makers. In this talk, we demonstrate how Bayesian statistics and experimentation can be used to quantify uncertainty in phenomenological models in the case of microorganism survival under short-term high heat exposure. We show how this can help stakeholders make better risk-informed decisions and avoid the unwarranted conservatism that is often prescribed when processes are not well understood. The experiment performed for this study employs a 6 kW infrared heater to test survivability of heat-resistant Bacillus canaveralius 29669 at temperatures as high as 350 °C for time durations less than 30 sec. The objective of this study was to determine SALs for various time-temperature combinations, with a focus on those time-temperature pairs that give a SAL of 10^-6. Survival ratio experiments were performed that allow estimation of the number of surviving spores and mortality rates characterizing the effect of the heat treatment on the spores.
Simpler but less informative fraction-negative experiments that only provide a binary sterile/not-sterile outcome were also performed once a sterilization temperature regime was established from survival ratio experiments. The phenomenological model considered here is a memoryless mortality model that underlies many heat sterilization protocols in use today. This discussion and poster will outline how the experiment and model were brought together to determine SALs for the heat treatment under consideration. Ramifications to current NASA planetary protection sterilization specifications and current missions under development such as Mars Sample Return will be discussed. This presentation/poster is also relevant to experimenters and microbiologists working on military and private medical device applications where risk to human life is determined by sterility assurance of equipment.
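For readers unfamiliar with the terminology, the conventional first-order (memoryless) thermal inactivation model and the SAL it implies can be written as follows; the D-value notation is the standard textbook form and is not necessarily the exact parameterization used in this study.

```latex
% Conventional first-order (memoryless) survival model: N_0 initial spores are
% reduced by one log for every D_T units of time at temperature T.
N(t) \;=\; N_0 \, 10^{-t/D_T}

% With expected survivors \lambda = N_0 10^{-t/D_T} and a Poisson survival assumption,
% the sterility assurance level is
\mathrm{SAL} \;=\; P(\text{at least one survivor}) \;=\; 1 - e^{-\lambda} \;\approx\; \lambda
\qquad (\lambda \ll 1),
% so a requirement such as SAL = 10^{-6} fixes the needed time-temperature exposure.
```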
Room C
Session 5C: Methods and Tools for T&E | Session Chair: Sam McGregor, AFOTEC
Confidence Intervals for Derringer and Suich Desirability Function Optimal Points
Peter Calhoun (HQ AFOTEC)
Peter Calhoun received the B.S. degree in Applied Mathematics from the University of New Mexico, M.S. in Operations Research from the Air Force Institute of Technology (AFIT), and Ph.D. in Applied Mathematics from AFIT. He has been an Operations Research Analyst with the United States Air Force since 2017. He is currently an Operational Test Analyst at HQ AFOTEC. His research interests are analysis of designed experiments, multivariate statistics, and response surface methodology. A shortfall of the Derringer and Suich (1980) desirability function for multi-objective optimization has been a lack of inferential methods to quantify uncertainty. Most articles addressing uncertainty involve robust methods, providing a point estimate that is less affected by variation. A few articles address confidence intervals or bands, but not specifically for the widely used Derringer and Suich method. Eight methods are presented to construct 100(1-alpha)% confidence intervals around Derringer and Suich desirability function optimal values. First order and second order models using bivariate and multivariate data sets are used as examples to demonstrate effectiveness. The eight proposed methods include a simple best/worst case method, 2 generalized methods, 4 simulated surface methods, and a nonparametric bootstrap method. One of the generalized methods, 2 of the simulated surface methods, and the nonparametric method account for covariance between the response surfaces. All eight methods perform reasonably well on the second order models; however, the methods which utilize an underlying multivariate-t distribution, Multivariate Generalized (MG) and Multivariate t Simulated Surface (MVtSSig), are the recommended methods from this research as they perform well with small samples for both first order and second order models, with coverage only becoming unreliable at consistently non-optimal solutions. MG and MVtSSig inference could also be used in conjunction with robust methods such as Pareto Front Optimization to help ascertain which solutions are more likely to be optimal before constructing confidence intervals.
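For reference, the standard Derringer and Suich construction (shown here for a larger-is-better response; target and two-sided forms follow the same pattern) maps each predicted response to a desirability in [0, 1] and combines them with a geometric mean:

```latex
% Standard Derringer and Suich (1980) construction for a larger-is-better response
% with lower bound L_i, target T_i, and shape weight r_i:
d_i(\hat{y}_i) \;=\;
\begin{cases}
0, & \hat{y}_i \le L_i,\\[4pt]
\left( \dfrac{\hat{y}_i - L_i}{T_i - L_i} \right)^{r_i}, & L_i < \hat{y}_i < T_i,\\[8pt]
1, & \hat{y}_i \ge T_i,
\end{cases}
\qquad
D \;=\; \left( \prod_{i=1}^{k} d_i \right)^{1/k}.
% The optimal operating point maximizes the overall desirability D; the confidence
% intervals above are constructed around quantities evaluated at that optimum.
```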
Skyborg Data Pipeline
Alexander Malburg (AFOTEC/EX) Show Bio
AFOTEC/EX Data Analyst.

The purpose of the Skyborg Data Pipeline is to allow for rapid turnover of flight data collected during a test event, using collaborative, easily accessible tool sets available in the AFOTEC Data Vault. Ultimately, the goal of this data pipeline is to provide a working, up-to-date dashboard that leadership can use shortly after a test event.
Circular Error Probable and an Example with Multilevel Effects
Jacob Warren (Marine Corps Operational Test and Evaluation Activity) Show Bio
Jacob Warren is the Assistant Scientific Advisor for the Marine Corps Operational Test and Evaluation Activity (MCOTEA). He has worked for MCOTEA since 2011, starting as a statistician before moving into his current role. Mr. Warren has a Master of Science degree in Applied Statistics from the Rochester Institute of Technology.

Circular Error Probable (CEP) is a measure of a weapon system’s precision based on the bivariate normal distribution. Failing to understand the theory behind CEP can result in misuse of the equations developed to aid estimation. Estimation of CEP is also much more straightforward in situations such as single samples where factors are not being manipulated. This brief aims to build a theoretical understanding of CEP and then presents a non-trivial example in which CEP is estimated via multilevel regression. The goal is to build an understanding of CEP so it can be properly estimated in trivial (single sample) and non-trivial cases (e.g., regression and multilevel regression).
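For orientation, the sketch below estimates CEP from simulated impact data under a bivariate normal model in the simple single-sample case; the data, the assumption of zero correlation, and the closed-form approximation are illustrative only and do not represent the brief's multilevel example.

```python
# Minimal sketch (illustrative): estimating Circular Error Probable (CEP) from impact data
# under a bivariate normal model. Data are simulated; a real analysis would also handle
# bias, correlation, and grouping factors (e.g., multilevel regression).
import numpy as np

rng = np.random.default_rng(7)
hits = rng.multivariate_normal(mean=[0, 0], cov=[[4.0, 0.0], [0.0, 2.25]], size=200)

# Fit the bivariate normal (sample means and standard deviations, assuming no correlation).
mu = hits.mean(axis=0)
sigma = hits.std(axis=0, ddof=1)

# CEP = radius of the circle about the mean point of impact containing 50% of shots.
# Compute it numerically by simulating radial misses from the fitted distribution.
sim = rng.multivariate_normal(mean=mu, cov=np.diag(sigma**2), size=100_000)
radial = np.linalg.norm(sim - mu, axis=1)
cep_numeric = np.quantile(radial, 0.5)

# Common closed-form approximation, reasonable when the two sigmas are similar.
cep_approx = 0.5887 * (sigma[0] + sigma[1])

print(f"CEP (numerical) ~ {cep_numeric:.2f}, CEP (approximation) ~ {cep_approx:.2f}")
```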
Room E Virtual Session
|
Mini-Tutorial 5 Session Chair: John Haman, IDA
The Automaton General-Purpose Data Intelligence Platform
Jeremy Werner (DOT&E) Show Bio
Jeremy Werner, PhD, ST was appointed DOT&E’s Chief Scientist in December 2021 after initially joining DOT&E as an Action Officer for Naval Warfare in August 2021. Before then, Jeremy was at Johns Hopkins University Applied Physics Laboratory (JHU/APL), where he founded a data science-oriented military operations research team that transformed the analytics of an ongoing military mission. Jeremy previously served as a Research Staff Member at the Institute for Defense Analyses, where he supported DOT&E in the rigorous assessment of a variety of systems and platforms. Jeremy received a PhD in physics from Princeton University, where he was an integral contributor to the Compact Muon Solenoid collaboration in the experimental discovery of the Higgs boson at the Large Hadron Collider at CERN, the European Organization for Nuclear Research in Geneva, Switzerland. Jeremy is a native Californian and received a bachelor’s degree in physics from the University of California, Los Angeles, where he was the recipient of the E. Lee Kinsey Prize (most outstanding graduating senior in physics).

The Automaton general-purpose data intelligence platform abstracts data analysis out to a high level and automates many routine analysis tasks while being highly extensible and configurable, enabling complex algorithms to elucidate mission-level effects. Automaton is built primarily on top of the R Project, and its features enable analysts to build charts and tables, calculate aggregate summary statistics, group data, filter data, pass arguments to functions, generate animated geospatial displays for geospatial time series data, flatten time series data into summary attributes, fit regression models, create interactive dashboards, and conduct rigorous statistical tests. All of these analysis capabilities are automated and enabled from an intuitive configuration file requiring no additional software code; analysts or software engineers can also easily extend Automaton to include new algorithms. Automaton’s development was started at Johns Hopkins University Applied Physics Laboratory in 2018 to support an ongoing military mission and perform statistically rigorous analyses that use Bayesian-inference-based artificial intelligence to elucidate mission-level effects. Automaton has unfettered Government Purpose Rights and is freely available. One of DOT&E’s strategic science and technology thrusts entails automating data analyses for Operational Test & Evaluation as well as developing data analysis techniques and technologies targeting mission-level effects; Automaton will be used, extended, demonstrated, trained on, and freely shared to accomplish these goals and to collaborate with others to drive our Department’s shared mission forward. This tutorial will provide an overview of Automaton’s capabilities (first 30 minutes, for Action Officers and Senior Leaders) as well as instruction on how to install and use the platform (remaining duration, hands-on time with technical practitioners).
Installation instructions are below and depend on the user installing Windows Subsystem for Linux (WSL) or having access to another Unix environment (e.g., macOS).

Please install WSL V2 on your machine before the tutorial: https://learn.microsoft.com/en-us/windows/wsl/install

Then please download/unzip the Automaton demo environment and place it in your home directory: https://www.edaptive.com/dataworks/automaton_2023-04-18_dry_run_1.tar

Then open PowerShell from your home directory and type:

wsl --import automaton_2023-04-18_dry_run_1 automaton_2023-04-18_dry_run_1 automaton_2023-04-18_dry_run_1.tar
wsl -d automaton_2023-04-18_dry_run_1

3:00 PM – 3:20 PM
|
Break
3:20 PM – 4:20 PM: Parallel Sessions
Room A Virtual Session
|
Session 6A: Tools for Decision-Making and Data Management Session Chair: Tyler Lesthaeghe, University of Dayton Research Institute
User-Friendly Decision Tools
Clifford Bridges (IDA) Show Bio
Clifford is formally trained in theoretical mathematics and has additional experience in education, software development, and data science. He has been working for IDA since 2020 and often uses his math and data science skills to support sponsors’ needs for easy-to-use analytic capabilities. Prior to starting at IDA, Clifford cofounded a startup company in the fashion technology space and served as its Chief Information Officer.

Personal experience and anecdotal evidence suggest that presenting analyses to sponsors, especially technical sponsors, is improved by helping the sponsor understand how results were derived. Providing summaries of analytic results is necessary but can be insufficient when the end goal is to help sponsors make firm decisions. When time permits, engaging sponsors with walk-throughs of how results may change given different inputs is particularly effective in helping sponsors make decisions in the context of the bigger picture. Data visualizations and interactive software are common examples of what we call “decision tools” that can walk sponsors through varying inputs and views of the analysis. Given long-term engagement and regular communication with a sponsor, developing user-friendly decision tools is a helpful practice to support sponsors. This talk presents a methodology for building decision tools that combines leading practices in agile development and STEM education. We will use a Python-based app development tool called Streamlit to show implementations of this methodology.
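To give a sense of what a Streamlit-based decision tool looks like, the sketch below is a hypothetical example (not the talk's actual tool): a sponsor varies a couple of inputs with sliders and immediately sees how a simple binomial analysis changes. The file name, inputs, and analysis are all assumptions for illustration.

```python
# decision_tool.py -- minimal sketch (hypothetical example, not the talk's tool):
# a Streamlit app that lets a sponsor vary inputs and see how the result changes.
# Run with: streamlit run decision_tool.py
import numpy as np
import streamlit as st

st.title("Illustrative decision tool")

# Sponsor-facing inputs (assumed, illustrative parameters).
n_trials = st.slider("Number of test events", min_value=5, max_value=100, value=20)
p_success = st.slider("Assumed per-event success probability", 0.5, 1.0, 0.9)
k_required = st.number_input("Successes required", min_value=1, max_value=100, value=18)

# Simple analysis behind the tool: probability of at least k successes (binomial model,
# estimated by simulation so the sponsor can also see the full outcome distribution).
successes = np.random.default_rng(0).binomial(n_trials, p_success, size=100_000)
prob = float((successes >= k_required).mean())

st.metric("Estimated probability of meeting the requirement", f"{prob:.2%}")
st.line_chart(np.bincount(successes, minlength=n_trials + 1) / successes.size)
```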
Seamlessly Integrated Materials Labs at AFRL
Lauren Ferguson (Air Force Research Laboratory) Show Bio
Dr. Lauren Ferguson is the Digital Transformation Lead in the Materials & Manufacturing Directorate of the Air Force Research Laboratory in Dayton, OH. She earned her PhD in mathematics from Texas A&M University, where she became interested in mathematical applications to materials science problems through an NSF fellowship. She spent eight years developing state-of-the-art simulation tools for composite materials that accurately model post-processing material state, capture complex damage patterns due to service loads and environments, and predict remaining life. For the last two years, she has pivoted to driving digital transformation efforts at AFRL, including facilitating pilot projects to seamlessly integrate labs for streamlined data collection and analysis, and to make Google Workspace and Cloud tools available to foster collaboration with global partners.

One of the challenges to conducting research in the Air Force Research Laboratory is that many of our equipment controllers cannot be directly connected to our internal networks, due to older or specialized operating systems and the need for administrative privileges for proper functioning. This means that the current data collection process is often highly manual, with users documenting experiments in physical notebooks and transferring data via CDs or portable hard drives to connected systems for sharing or further processing. In the Materials & Manufacturing Directorate, we have developed a unique approach to seamlessly integrate our labs for more efficient data collection and transfer, which is specifically designed to help users ensure that data are findable for future reuse. In this talk, we will highlight our two enabling tools: NORMS, which helps users easily generate metadata for direct association with data collected in the lab, eliminating physical notebooks; and Spike, which automates one-way data transfer from isolated systems to databases mirrored on other networks. In these databases, metadata can be used for complex search queries, and data are automatically shared with project members without requiring additional transfers. The impact of this solution has been significantly faster data availability (including searchability) for all project members: a transfer and scanning process that used to take three hours now takes a few minutes. Future use cases will also enable Spike to transfer data directly into cloud buckets for in situ analysis, which would streamline collaboration with partners.

Room B Virtual Session
|
Session 6B: Methods for DoD System Supply Chain and Performance Estimation Session Chair: Margaret Zientek, IDA
Applications of Network Methods for Supply Chain Review
Zed Fashena (IDA) Show Bio
Zed Fashena is currently a Research Associate in the Information Technology and Systems Division at the Institute for Defense Analyses. He holds a Master of Science in Statistics from the University of Wisconsin-Madison and a Bachelor of Arts in Economics from Carleton College (MN).

The DoD maintains a broad array of systems, each one sustained by an often complex supply chain of components and suppliers. The ways these supply chains are interlinked can have major implications for the resilience of the defense industrial base as a whole and for the readiness of multiple weapon systems. Finding opportunities to improve overall resilience requires gaining visibility into potential weak links in the chain, which in turn requires integrating data across multiple disparate sources. By using open-source data pipeline software to enhance reproducibility, together with flexible network analysis methods, multiple stovepiped data sources can be brought together to develop a more complete picture of the supply chain across systems.
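As one illustration of how a network view can surface weak links, the sketch below builds a toy supplier-to-component-to-system graph and flags suppliers whose removal would leave some system unreachable from any remaining supplier. The graph, node names, and criterion are assumptions for illustration, not the talk's data or method.

```python
# Minimal sketch (illustrative, assumed toy data): representing systems, components, and
# suppliers as a directed graph and flagging suppliers whose loss would cut off a system.
import networkx as nx

G = nx.DiGraph()
# Hypothetical supply-chain edges: supplier -> component -> system.
G.add_edges_from([
    ("Supplier1", "Actuator"), ("Supplier2", "Actuator"),
    ("Supplier3", "Seeker"),                       # single-source component
    ("Actuator", "SystemA"), ("Actuator", "SystemB"),
    ("Seeker", "SystemA"), ("Seeker", "SystemC"),
])

suppliers = [n for n in G if G.in_degree(n) == 0]
systems = [n for n in G if G.out_degree(n) == 0]

# A supplier is a potential weak link if some system becomes unreachable from every
# remaining supplier once that supplier is removed from the network.
for s in suppliers:
    H = G.copy()
    H.remove_node(s)
    cut_off = [sys for sys in systems
               if not any(nx.has_path(H, other, sys) for other in suppliers if other != s)]
    if cut_off:
        print(f"{s} is single-source for: {', '.join(cut_off)}")
```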
Predicting Aircraft Load Capacity Using Regional Climate Data
Abraham Holland (IDA) Show Bio
Dr. Abraham Holland joined the Institute for Defense Analyses (IDA) in 2019 after completing his PhD in Public Policy at Harvard University. At IDA, Dr. Holland is an applied microeconomist who has led a range of analyses across defense manpower, operations, and infrastructure topics. He is also a founding member of IDA’s climate and energy security working group, a group of researchers focused on supporting IDA’s capability to bring the best available climate science to today’s national security challenges. In this area, he has completed analyses on the potential impact of climate change on Department of Defense equipment, personnel, and operations. In addition to being a U.S. Air Force veteran, he received his undergraduate degree from Dartmouth College, where he graduated summa cum laude in economics and Chinese literature.

While the impact of local weather conditions on aircraft performance is well documented, climate change has the potential to create long-term shifts in aircraft performance. Using just one metric, internal load capacity, we document operationally relevant performance changes for a UH-60L within the Indo-Pacific region. This presentation uses publicly available climate and aircraft performance data to create a representative analysis. The underlying methodology can be applied at varying geographic resolutions, timescales, airframes, and aircraft performance characteristics across the entire globe.

Room C
|
Session 6C: Statistical Methods for Ranking and Functional Data Types Session Chair: Elise Roberts, JHU/APL
An Introduction to Ranking Data and a Case Study of a National Survey of First Responders
Adam Pintar (National Institute of Standards and Technology) Show Bio
Adam earned a Ph.D. in Statistics from Iowa State University in 2010 and has been a Mathematical Statistician with NIST’s Statistical Engineering Division since then. His primary focus is providing statistical and machine learning expertise and insight on multidisciplinary research teams. He has collaborated with researchers from very diverse backgrounds such as social science, engineering, chemistry, and physics. He is a Past Chair of the Statistics Division of the American Society for Quality (ASQ), he currently serves on the editorial board of the journal Transactions on Mathematical Software, and he is a member of the American Statistical Association and a senior member of ASQ.

Ranking data are collected by presenting a respondent with a list of choices and then asking for the respondent’s favorite, second favorite, and so on. The rankings may be complete (the respondent rank orders the entire list) or partial (only the respondent’s top two or three choices, for example). Given a sample of rankings from a population, one goal may be to estimate the most favored choice in the population. Another may be to compare the preferences of one subpopulation to another. In this presentation I will introduce ranking data and the probability models that form the foundation for statistical inference with them; the Plackett-Luce model will be the main focus. After that I will introduce a real data set containing ranking data assembled by the National Institute of Standards and Technology (NIST) based on the results of a national survey of first responders. The survey asked how first responders use communication technology. With this data set, questions such as whether rural and urban/suburban first responders prefer the same types of communication devices can be explored. I will conclude with some ideas for incorporating ranking data into test and evaluation settings.
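To make the Plackett-Luce model concrete, the sketch below computes the probability of an observed (complete or partial) ranking given a set of positive "worth" parameters. The items and worth values are hypothetical, chosen only to echo the communication-device theme; they are not estimates from the NIST survey.

```python
# Minimal sketch (illustrative, assumed worth parameters): under the Plackett-Luce model,
# a ranking is built by repeatedly choosing the next-most-preferred item with probability
# proportional to its worth among the items not yet ranked.
worths = {"smartphone": 4.0, "radio": 3.0, "tablet": 2.0, "laptop": 1.0}  # hypothetical

def ranking_probability(ranking, worths):
    """Probability of observing `ranking` (best to worst, possibly partial) under Plackett-Luce."""
    remaining = dict(worths)
    prob = 1.0
    for item in ranking:
        prob *= remaining[item] / sum(remaining.values())
        del remaining[item]
    return prob

print(ranking_probability(["radio", "smartphone"], worths))                       # partial (top-2) ranking
print(ranking_probability(["smartphone", "radio", "tablet", "laptop"], worths))   # complete ranking
```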
Estimating Sparsely and Irregularly Observed Multivariate Functional Data
Maximillian Chen (Johns Hopkins University Applied Physics Laboratory) Show Bio
Max Chen received his PhD from the Department of Statistical and Data Sciences at Cornell University. He previously worked as a senior member of technical staff at Sandia National Laboratories. Since December 2019, Max has been a Senior Professional Staff member at the Johns Hopkins University Applied Physics Laboratory. He is interested in developing novel statistical methodologies in the areas of high-dimensional data analysis, dimension reduction and hypothesis testing methods for matrix- and tensor-variate data, functional data analysis, dependent data analysis, and data-driven uncertainty quantification.

With the rise in availability of larger datasets, there is a growing need for tools and methods to help inform data-driven decisions. Data that vary over a continuum, such as time, exist in a wide array of fields, such as defense, finance, and medicine. One class of methods that addresses data varying over a continuum is functional data analysis (FDA). FDA methods typically make three assumptions that are often violated in real datasets: all observations exist over the same continuum interval (such as a closed interval [a, b]); all observations are regularly and densely observed; and, if the dataset consists of multiple covariates, the covariates are independent of one another. We address violations of the latter two assumptions. In this talk, we will discuss methods for analyzing functional data that are irregularly and sparsely observed, while also accounting for dependencies between covariates. These methods will be used to estimate the reconstruction of partially observed multivariate functional data that contain measurement errors. We will begin with a high-level introduction to FDA. Next, we will introduce functional principal components analysis (FPCA), the representation of functions on which our estimation methods are based. We will discuss a specific approach called principal components analysis through conditional expectation (PACE) (Yao et al., 2005), which computes the FPCA quantities for a sparsely or irregularly sampled function. The PACE method is a key component that allows us to estimate partially observed functions based on the available dataset. Finally, we will introduce multivariate functional principal components analysis (MFPCA) (Happ & Greven, 2018), which utilizes the FPCA representations of each covariate’s functions to compute a principal components representation that accounts for dependencies between covariates. We will illustrate these methods through implementation on simulated and real datasets. We will discuss our findings in terms of the accuracy of our estimates with regard to the amount and portions of a function that are observed, as well as the diversity of functional observations in the dataset. We will conclude with a discussion of future research directions.
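For readers new to FPCA, the sketch below performs basic functional PCA on densely observed simulated curves via an eigendecomposition of the sample covariance on a common grid, then reconstructs each curve from its leading scores. This is only the dense, univariate starting point; PACE and MFPCA, as discussed in the abstract, extend the idea to sparse/irregular and multivariate settings. The simulated curves and the omission of quadrature weights are simplifying assumptions.

```python
# Minimal sketch (illustrative, simulated data): basic FPCA on densely observed curves.
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 101)    # common observation grid
n = 50                        # number of curves

# Simulated curves: mean function plus two random modes of variation plus noise.
mean_fn = np.sin(2 * np.pi * t)
scores = rng.normal(size=(n, 2)) * np.array([1.0, 0.4])
modes = np.vstack([np.sqrt(2) * np.cos(2 * np.pi * t), np.sqrt(2) * np.sin(4 * np.pi * t)])
X = mean_fn + scores @ modes + rng.normal(0, 0.1, size=(n, t.size))

# FPCA: center, estimate the covariance surface, take the leading eigenfunctions.
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (n - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print("Fraction of variance explained by first 3 components:",
      np.round(eigvals[:3] / eigvals.sum(), 3))

# Reconstruct each curve from its first two principal component scores.
pc_scores = Xc @ eigvecs[:, :2]
X_hat = X.mean(axis=0) + pc_scores @ eigvecs[:, :2].T
print("Mean reconstruction RMSE:", float(np.sqrt(((X - X_hat) ** 2).mean())))
```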
Room E Virtual Session
|
Mini-Tutorial 6 Session Chair: Elizabeth Gregory, NASA Langley Research Center
An Introduction to Uncertainty Quantification for Modeling & Simulation
James Warner (NASA Langley Research Center) Show Bio
Dr. James (Jim) Warner joined NASA Langley Research Center (LaRC) in 2014 as a Research Computer Engineer after receiving his PhD in Computational Solid Mechanics from Cornell University. Previously, he received his B.S. in Mechanical Engineering from SUNY Binghamton University and held temporary research positions at the National Institute of Standards and Technology and Duke University. Dr. Warner is a member of the Durability, Damage Tolerance, and Reliability Branch (DDTRB) at LaRC, where he focuses on developing computationally efficient approaches for uncertainty quantification for a range of applications including structural health management, additive manufacturing, and trajectory simulation. Additionally, he works to bridge the gap between UQ research and NASA mission impact, helping to transition state-of-the-art methods to solve practical engineering problems. To that end, he has recently been involved in efforts to certify the xEMU spacesuit and to develop guidance systems for entry, descent, and landing on Mars. His other research interests include machine learning, high performance computing, and topology optimization.

Predictions from modeling and simulation (M&S) are increasingly relied upon to inform critical decision making in a variety of industries, including defense and aerospace. As such, it is imperative to understand and quantify the uncertainties associated with the computational models used, the inputs to the models, and the data used for calibration and validation of the models. The rapidly evolving field of uncertainty quantification (UQ) combines elements of statistics, applied mathematics, and discipline engineering to provide this utility for M&S. This mini-tutorial provides an introduction to UQ for M&S geared toward engineers and analysts with little-to-no experience in the field but with some knowledge of probability and statistics. A brief review of basic probability will be provided before discussing some core UQ concepts in more detail, including uncertainty propagation and the use of Monte Carlo simulation for making probabilistic predictions with computational models, model calibration to estimate uncertainty in model input parameters using experimental data, and sensitivity analysis for identifying the most important and influential model input parameters. Examples from relevant NASA applications are included, and references are provided throughout to point viewers to resources for further study.
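As a primer on the uncertainty propagation step described above, the sketch below pushes samples of uncertain inputs through a stand-in analytic model with Monte Carlo simulation and summarizes the resulting output distribution. The model, the input distributions, and the exceedance threshold are all assumptions for illustration; in practice the model would be the engineering simulation of interest.

```python
# Minimal sketch (illustrative): forward uncertainty propagation via Monte Carlo simulation.
import numpy as np

rng = np.random.default_rng(42)

def model(drag_coeff, mass, velocity):
    """Placeholder computational model: a simple quantity of interest (illustrative only)."""
    return 0.5 * drag_coeff * mass * velocity**2

# Step 1: characterize input uncertainty (assumed distributions).
n = 100_000
drag_coeff = rng.normal(1.2, 0.05, n)     # uncertainty in a coefficient
mass = rng.uniform(950.0, 1050.0, n)      # manufacturing tolerance on mass
velocity = rng.normal(250.0, 10.0, n)     # variability in flight condition

# Step 2: propagate the samples through the model.
qoi = model(drag_coeff, mass, velocity)

# Step 3: summarize the output distribution for decision making.
mean, std = qoi.mean(), qoi.std(ddof=1)
p05, p95 = np.percentile(qoi, [5, 95])
prob_exceed = float((qoi > 4.5e7).mean())   # probability of exceeding an assumed threshold

print(f"mean = {mean:.3e}, std = {std:.3e}, 90% interval = ({p05:.3e}, {p95:.3e})")
print(f"P(QoI > threshold) = {prob_exceed:.3f}")
```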
4:20 PM – 4:40 PM
|
Room A+B Virtual Session
Awards
Army Wilks Award: Wilks Award Winner to Be Announced
ASA SDNS Student Poster Awards: Student Winners to Be Announced
DATAWorks Distinguished Leadership Award: Winner to Be Announced
4:40 PM – 4:50 PM
|
Room A+B Virtual Session
Closing Remarks General Norty Schwartz (U.S. Air Force, retired / Institute for Defense Analyses) Show Bio
Norton A. Schwartz serves as President of the Institute for Defense Analyses (IDA), a nonprofit corporation operating in the public interest. IDA manages three Federally Funded Research and Development Centers that answer the most challenging U.S. security and science policy questions with objective analysis leveraging extraordinary scientific, technical, and analytic expertise. At IDA, General Schwartz (U.S. Air Force, retired) directs the activities of more than 1,000 scientists and technologists employed by IDA. General Schwartz has a long and prestigious career of service and leadership that spans over 5 decades. He was most recently President and CEO of Business Executives for National Security (BENS). During his 6-year tenure at BENS, he was also a member of IDA’s Board of Trustees. Prior to retiring from the U.S. Air Force, General Schwartz served as the 19th Chief of Staff of the U.S. Air Force from 2008 to 2012. He previously held senior joint positions as Director of the Joint Staff and as the Commander of the U.S. Transportation Command. He began his service as a pilot with the airlift evacuation out of Vietnam in 1975. General Schwartz is a U.S. Air Force Academy graduate and holds a master’s degree in business administration from Central Michigan University. He is also an alumnus of the Armed Forces Staff College and the National War College. He is a member of the Council on Foreign Relations and a 1994 Fellow of Massachusetts Institute of Technology’s Seminar XXI. General Schwartz has been married to Suzie since 1981. |