The last few years have brought plenty of worry about artificial intelligence (AI), as business leaders and technologists fret over the enormous decision-making power they believe it wields.
As a data scientist, I’m used to being the voice of reason about the possibilities and limitations of AI. In this article I will explain how companies can use blockchain technology to manage model development, gain a clearer understanding of their AI, make the model development process auditable, and identify and assign responsibility for AI solutions.
Use blockchain to manage model development
Although there is widespread awareness of the need to govern AI, the discussion of how to do so is often vague, as in “How to embed reporting in your AI” at the Harvard Business Review:
Evaluate governance structures. A healthy AI governance ecosystem must include governance processes and structures … AI accountability means seeking solid evidence of governance at the organizational level, including clear goals and objectives for the AI system; well-defined roles, responsibilities, and lines of authority; a multidisciplinary workforce capable of managing AI systems; a broad range of stakeholders; and risk management processes. In addition, it is vital to look for system-level governance elements, such as documented technical specifications of the specific AI system, compliance, and stakeholder access to information on system design and operation.
This comprehensive list of requirements is enough to make any reader’s head spin. How exactly does an organization go about establishing “system-level controls” and providing “stakeholder access to information on system design and operation”?
Here is a real, actionable tip: use blockchain technology to ensure that every decision made about an AI or machine learning model is recorded and auditable. (Full disclosure: in 2018 I filed a US patent application [16/128,359] on using a blockchain to manage model development.)
How blockchain creates an audit trail
Developing an AI decision-making model is a complex process that involves countless incremental decisions – model variables, model design, the training and test data used, feature selection, and more. All of these decisions can be recorded on the blockchain, which can also provide a way to review the model’s raw latent features. The blockchain can likewise record which scientists built each part of the variable sets, who was involved in setting the model’s weights, and who tested the model.
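As an illustration only – not FICO’s implementation – the kinds of incremental decisions described above can be captured as structured, hashable records. All names and fields here are hypothetical:

```python
import hashlib
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class ModelDecision:
    """One incremental decision in a model's development history."""
    model_id: str        # which model this decision belongs to
    decision_type: str   # e.g. "variable_added", "training_data_selected"
    description: str     # human-readable rationale for the decision
    author: str          # data scientist accountable for the decision

    def fingerprint(self) -> str:
        # A stable SHA-256 hash of the record makes later tampering detectable.
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

decision = ModelDecision(
    model_id="fraud-model-v2",
    decision_type="variable_added",
    description="Added transaction-velocity variable; reviewed for bias",
    author="j.doe",
)
print(decision.fingerprint())  # 64-character hex digest
```

Because the fingerprint is deterministic, any later change to the record – a different author, a reworded rationale – produces a different hash, which is what makes such records useful as blockchain entries.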
Model governance and transparency are essential for building ethical, auditable AI technology. The sum total record of these decisions, as enabled by blockchain technology, provides the visibility needed to effectively manage models internally, attribute accountability, and satisfy the regulators who are surely coming for your AI.
Before blockchain: Analytic model sprawl
Before blockchain became a buzzword, I began applying a similar approach to analytic model governance in my data science organization. In 2010, I instituted a development process centered on the analytic tracking document (ATD). This approach details the model design, variable sets, assigned scientists, training and test data, and success criteria, and breaks the entire development process into three or more agile sprints.
I realized a structured ATD approach was needed because I had seen too many negative outcomes from what had become the norm in much of the financial industry: a lack of validation and accountability. Using banking as an example, a decade ago the typical lifespan of an analytic model looked like this:
- A data scientist builds a model, independently choosing the variables it contains. This leads scientists to create redundant variables, to skip validated variable designs, and to introduce new errors into the model code. In the worst cases, a data scientist’s variable choices can introduce bias, model sensitivities, or target leakage.
- When that data scientist leaves the organization, his or her development directories are usually deleted, or, if several different directories exist, it becomes unclear which ones produced the final model. The bank often does not have the model’s source code, or has only parts of it. In the end, no one can tell just by looking at the code how the model was built, what data it was built on, or what assumptions went into building it.
- Ultimately, the bank can be left in a high-risk position, assuming the model was built correctly and will behave well – but without knowing it. The bank is in no position to validate the model or to understand the conditions under which it will become unreliable. These realities lead to unnecessary risk, or to large numbers of models being discarded and rebuilt, often repeating the journey above.
Blockchain to codify accountability
My patented invention describes how to codify the development of analytic and machine learning models by using blockchain technology to link a chain of entities, work tasks, and requirements to a model, including testing and validation checks. It reproduces much of the approach I have historically used to build models in my organization – the ATD is essentially a contract among my scientists, their managers, and me that describes:
- What the model is
- The goals of the model
- How we will build the model, including the prescribed machine learning algorithms
- The areas where the model must improve, such as a 30% improvement in detecting card-not-present (CNP) credit card fraud at the transaction level
- The degrees of freedom the scientists have in solving the problem, and those they do not
- The reuse of trusted, validated variables and model code snippets
- Training and test data requirements
- Ethical AI procedures and tests
- Robustness and stability tests
- Model-specific testing and validation checklists
- The specific scientists assigned to select variables, build and train the models, and those who will validate the code, confirm results, and test the model’s variables and outputs
- Specific success criteria for the model and for specific customer segments
- Specific analytic sprints, tasks, assigned scientists, and formal reviews/approvals of sprint requirements
As you can see, the ATD spells out a very specific set of requirements. The team includes the direct modeling manager, the group of data scientists assigned to the project, and me as the owner of the agile model development process. Everyone on the team signs the ATD as a contract once we all agree on our roles, responsibilities, deadlines, and build requirements. The ATD becomes the document through which we define the entire agile model development process. It is then broken down into a set of requirements, roles, and tasks that are placed on the blockchain to be formally assigned, worked, validated, and completed.
With individuals tracked against each requirement, the team then evaluates a set of existing collateral, typically previously validated variable code and models. Some variables have been approved in the past, others will be adjusted, and still others will be new. The blockchain then records each use of a variable in the model – for example, whether its code was adopted from a code repository, written new, or modified – along with who did it, which tests were run, which modeling manager approved it, and my sign-off.
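A minimal sketch of how such records could be chained so that no entry can be altered without detection – a toy stand-in for a real blockchain, with hypothetical entry fields throughout:

```python
import hashlib
import json
import time

class ModelLedger:
    """Toy append-only ledger: each block hashes its predecessor,
    so rewriting history invalidates every later block."""

    def __init__(self):
        self.blocks = []

    def append(self, entry: dict) -> dict:
        prev_hash = self.blocks[-1]["hash"] if self.blocks else "0" * 64
        block = {"entry": entry, "prev_hash": prev_hash, "ts": time.time()}
        payload = json.dumps(block, sort_keys=True).encode()
        block["hash"] = hashlib.sha256(payload).hexdigest()
        self.blocks.append(block)
        return block

    def verify(self) -> bool:
        # Recompute every hash; any edited block breaks the chain.
        prev = "0" * 64
        for block in self.blocks:
            if block["prev_hash"] != prev:
                return False
            body = {k: v for k, v in block.items() if k != "hash"}
            payload = json.dumps(body, sort_keys=True).encode()
            if hashlib.sha256(payload).hexdigest() != block["hash"]:
                return False
            prev = block["hash"]
        return True

ledger = ModelLedger()
ledger.append({"variable": "txn_velocity", "source": "reused_validated",
               "tests": ["bias_check"], "approved_by": ["modeling_mgr", "cao"]})
ledger.append({"variable": "merchant_risk", "source": "new",
               "tests": ["bias_check", "stability"], "approved_by": ["modeling_mgr"]})
assert ledger.verify()
ledger.blocks[0]["entry"]["approved_by"] = ["nobody"]  # tampering...
assert not ledger.verify()                             # ...is detected
```

The design choice worth noting is that each block’s hash covers both its own entry and the previous block’s hash, so retroactively editing an approval requires recomputing every subsequent block – exactly the tamper evidence the audit trail relies on.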
Blockchain enables detailed tracking
Importantly, the blockchain creates a trail of decision making. It shows whether a variable is acceptable, whether it introduces bias into the model, and whether the variable is used correctly. The blockchain is not just a checklist of positive outcomes; it is a record of how these models are built – mistakes, corrections, and improvements are all recorded. For example, outcomes such as failed ethical AI tests are stored on the blockchain, along with the remediation steps taken to eliminate the bias. We can see the journey at a very detailed level:
- The components of the model
- How the model works
- How the model responds to expected data, rejects bad data, or responds to a simulated changing environment
All of these elements are codified along with who worked on the model and who approved each action. At the end of the project we can see, for example, that every variable in this critical model was reviewed, recorded on the blockchain, and approved.
This approach provides a high level of assurance that no one has added a variable to the model that performs poorly or introduces some form of bias. It ensures that no one has used an incorrect field in a data specification or changed validated variables without permission and revalidation. Without the critical review process the ATD (and now the blockchain) provides to keep my data science organization auditable, my data scientists could inadvertently introduce errors into a model, especially as these models and their associated algorithms grow more complex.
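To make that end-of-project review concrete, here is a hypothetical audit query over such ledger entries, checking that every variable in the model carries the required tests and approvals (all field names and role names are illustrative assumptions, not FICO’s actual schema):

```python
# Illustrative ledger entries: one record per variable decision.
ledger_entries = [
    {"variable": "txn_velocity", "tests_passed": ["bias_check", "stability"],
     "approved_by": ["modeling_mgr", "cao"]},
    {"variable": "merchant_risk", "tests_passed": ["bias_check"],
     "approved_by": ["modeling_mgr"]},
]

REQUIRED_TESTS = {"bias_check", "stability"}
REQUIRED_APPROVERS = {"modeling_mgr", "cao"}

def audit(entries):
    """Return the variables that fail the review/approval requirements."""
    failures = {}
    for e in entries:
        missing_tests = REQUIRED_TESTS - set(e["tests_passed"])
        missing_approvals = REQUIRED_APPROVERS - set(e["approved_by"])
        if missing_tests or missing_approvals:
            failures[e["variable"]] = {"tests": sorted(missing_tests),
                                       "approvals": sorted(missing_approvals)}
    return failures

print(audit(ledger_entries))
# {'merchant_risk': {'tests': ['stability'], 'approvals': ['cao']}}
```

An empty result from such a query is the machine-checkable form of the claim above: every variable reviewed, recorded, and approved.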
Transparent model development journeys lead to less bias
In summary, overlaying the model development process on the blockchain gives the analytic model its own entity, life, structure, and description. Model development becomes a structured process, at the end of which detailed documentation can be produced to confirm that every element passed the proper review. These elements can also be revisited at any time in the future, providing key assets for model governance. Many of these assets become part of the monitoring and governance requirements when the model is eventually deployed, rather than having to be discovered or recreated after development.
In this way, analytic model development and decision making become auditable – a critical factor in transforming AI technology – and the data scientists who design the models become accountable – an essential step in removing bias from the analytic models used to make decisions that affect people’s lives.
Scott Zoldi is chief analytics officer at FICO, responsible for the analytic development of FICO’s product and technology solutions. While at FICO, Scott has authored more than 110 analytic patents, with 71 granted and 46 pending. Scott is actively involved in the development of new analytic products and applications for big data analytics, many of which leverage new streaming analytic innovations such as adaptive analytics, collaborative profiling, and self-calibrating analytics. Scott has recently focused on applying streaming self-learning analytics to real-time cybersecurity attack detection. Scott serves on two boards of directors, Software San Diego and Cyber Center of Excellence. Scott received his PhD in theoretical and computational physics from Duke University.
The New Tech Forum provides a venue to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to email@example.com.
Copyright © 2022 IDG Communications, Inc.