Government Releases Generative AI Trial Evaluation

DTA

The DTA embarked on a whole-of-government trial into generative artificial intelligence (AI). Insights from this trial evaluation will inform the ongoing safe and responsible adoption of generative AI across government.

From 1 January 2024 to 30 June 2024, the DTA coordinated the Australian Government's trials of a generative AI service. It made Microsoft 365 Copilot (formerly Copilot for Microsoft 365) available to over 7,600 staff across 60+ government agencies.

Our release of the Evaluation of the whole-of-government trial of Microsoft 365 Copilot outlines the positive reception to the efficiencies and outputs of the AI productivity tools across the Australian Public Service (APS). It also highlights opportunities to explore more tailored solutions, strategies for uplifting functionality, and areas that still need improvement.

'The DTA has always been clear in its goal to not shy away from innovation,' stressed Lucy Poole, General Manager of Strategy, Planning and Performance. 'To swiftly provide the APS with tools to experiment with generative AI in a safe and responsible way, we identified a tool that would quickly integrate into most workplaces with minimal disruption.'

Benefits for participants

Most trial participants (77%) were satisfied with having an integrated AI tool, and even more (86%) wished to continue using it. Some 69% of APS participants felt there was a marked improvement in the speed of completing tasks, and nearly as many (61%) believed the tool enhanced the quality of their work output.

'We were particularly pleased to see that participants saved an average of an hour of work a day handling note taking and administrative duties,' highlights Ms Poole. 'Even more pleasing was that 40% of that time was dedicated to building skills through activities like mentoring; or it was dedicated to staff engagement and team building.'

A majority of the efficiencies were seen in various business-as-usual and office management tasks. These included capturing notes and action items, taking minutes, and kicking off document writing or planning. Participants also felt comfortable exploring novel capabilities the tool could bring to their job function. These included programming and scripting, generating images and entire presentation decks, as well as searching and summarising databases.

The range of applications and the positive sentiment relayed in the report are expected to grow as staff continue to be exposed to these tools. Confidence increased as participation in training increased. Specifically, 75% of participants who received 3 or more forms of training were confident in their ability to use the service, 28 percentage points higher than among those who undertook only one form of training.

Opportunities to improve

While there was marked improvement in the delivery of general tasks, some feedback pointed to the need to spend additional time reviewing generated content. This influenced some of the recommendations provided within the evaluation.

'As we are testing these tools at such an early stage, there are clear opportunities for tailored solutions to be developed that can handle highly technical material,' explains Ms Poole. 'The evaluation points to the importance of agencies carefully considering detailed and adaptable implementation of these solutions.'

'They should consider which generative AI solutions are most appropriate for their overall operating environment and their specific use cases. We're pleased that a lot of the recent work released by the DTA helps government agencies identify and address these very considerations.'

Earlier this week, the DTA published details about its trial of a draft Australian Government AI assurance framework and guidance. This draft framework outlines essential questions for government agencies to evaluate their AI use against Australia's AI Ethics Principles. It helps identify and mitigate AI risks throughout the AI lifecycle and ensures responsible AI usage.

This is part of our wider suite of AI in government resources, including the Policy for the responsible use of AI in government, its underlying standards for accountable officials and transparency statements, as well as new Guidance for staff training on AI.

Unexpected outcomes

Participants pointed towards various outcomes beyond the general scope of the trial. Many of these are either being actively considered or will be integrated into the DTA's ongoing policy development work.

The potential for generative AI to improve inclusivity and accessibility in the workplace was of great interest to some agencies. Our recent work tries to arm the APS with the tools they need to fulfil this potential in generative AI. Agencies should look to ensure their AI solutions incorporate:

  • Fairness. Their systems should seek to minimise disproportionate impacts on individuals or groups
  • Reliability and safety. Special consideration should be given to the underlying data sources, risk identification and mitigation practices, as well as ongoing monitoring and intervention controls
  • Privacy protection and security measures. These ensure entities have the proper authority to use this data and that regular assessments are undertaken to determine any potential impacts

'We are highly aware of the realities of bias creeping into these services,' outlines Ms Poole. 'Bias can emerge in data when it's incomplete, unrepresentative, or mirrors societal prejudices. AI models might replicate these biases from their training data, resulting in misleading or unfair outputs, insights, or recommendations.'

'That is why we continue to reiterate the importance of keeping a human in the loop. This is why we raise the importance of transparency, explainability, and thorough review across so much of our guidance.'

There was also discussion about the implications these solutions will have for attracting newcomers and entry-level workers to the APS, as well as how they will transform positions throughout government.

'Keeping with new innovations ensures the APS remains a competitive workplace,' concludes Ms Poole. 'Making tools available that improve the efficiency and quality of some types of work goes a long way in retaining new talent.'

'This cannot be outweighed by losing key skills around processing and synthesising concepts or knowledge of the work undertaken by an agency. These tools should instead enhance strategic thinking and making connections between disparate pieces of work undertaken by government. And always with the aim of improving outcomes for all people and businesses.'

The Evaluation of the whole-of-government trial of Microsoft 365 Copilot is available in both executive report and full report formats.

Join our industry briefing

We have opened registrations for an industry briefing outlining findings from the report. The session will take place at 11am AEDT on Friday 25 October 2024.

This deep dive into the evaluation will provide details on:

  • lessons and challenges in implementing such capabilities at scale and driving adoption by staff
  • benefits and obstacles to productivity, efficiency, quality of output, and improvements to processes
  • staff sentiment about their satisfaction, uptake, and confidence using generative AI
  • any unexpected positive or negative implications of the adoption of these tools

Places are limited, so register as soon as possible. Presenters will take as many questions as they can on the day, but we recommend attendees submit early comments through the registration form.
