Comparative Analysis of Cloning-Hashing Applications for Securing Digital Evidence

Abstract

cloning; hashing; evidence; security The development of the Internet has resulted in an increasing variety of cyber crimes. Cybercrime is closely related to digital evidence, so cybercriminals tend to delete, hide, and format all collected data to eliminate traces of digital evidence. This digital evidence is very vital in proving at trial, so it is necessary to develop applications to secure digital evidence. This study aims to compare the results of cloning and hashing in securing digital evidence and evaluate the performance of a digital forensic application developed under the name Clon-Hash Application v1 compared to applications commonly used by investigators including Autopsy, FTK Imager, md5.exe in terms of its function, the result, CPU usage. The results of the research conducted show that the cloning process is perfectly successful, as evidenced by the hash value results which are the same as paid applications and there are even several other applications that have not been able to display the hash value. Hash values in the Clon-Hash v1 application also vary from MD5, SHA1, and SHA256 which do not exist in other applications. Applications developed are better in terms of function, results, and CPU usage. In the legal process, it is very important to critically evaluate the quality and authenticity of the evidence in order to avoid an unjustified decision. The accuracy of digital forensic science has always been a concern, and still is; it is under debate. The National Institute of Standards Technology (NIST) USA [9] as a reference institution for digital forensics investigations and digital evidence handling in the world strongly criticizes all traditional analytical techniques such as DNA matching, fingerprints, bite marks, and firearm marks, footprints, and more. They refuse to recognize it as an exact and exact science. In addition, it is emphasized to establish clear scientific standards to verify the validity and reliability of digital forensics methods and allow the use of only scientifically proven methods in court, advocating the use of best practices in digital forensics. Digital forensics methods, which do not have proper authentication and solid statistical basis to justify different reasons and results are unacceptable. In addition, methods that do not meet the standards of systematic, objectivity, and independence are also not approved in current legal practice [10].
This Digital Evidence faces difficulties in meeting the standards of scientific criteria in court [11]. The lack of trust in digital forensics processes and the absence of a set set of rules for evaluation provide a smooth and accessible avenue for defense attorneys to challenge the evidence in courtrooms [12]. They found several loopholes in the process of collecting and comparing evidence to create reasonable doubts over the accuracy and credibility of the evidence [13]. Some cases also dispute the validity of digital forensics applications and methods. In digital forensics investigations, the authenticity and integrity of the evidence referred to as digital evidence is very important so it encourages the need to give critical attention to the process or procedure used in obtaining the evidence [6]. This study aims to explore applications in digital evidence maintenance such as cloning and hashing. Analyze the results of cloning and hashing of several relevant applications to scientifically validate the security of digital evidence.
Digital evidence has been recognized in accordance with the Law of the Republic of Indonesia Number 19 of 2016 concerning Electronic Information and Transactions, that electronic information and/or electronic documents and/or their printed results are legal evidence [14]. Digital evidence can be found on computer hard disks, cell phones, iPods, pen drives, digital cameras, CDs, DVDs, diskettes, computer networks, the Internet, and more [15]. Digital evidence that can be used as evidence, such as URL addresses that have been visited, e-mail messages or a collection of registered email addresses, word processing programs or extension formats used, spreadsheet documents used, image formats used if found, deleted or formatted files, passwords, windows registry, hidden files, event viewers logs, and application logs, including checking metadata [16]. Digital evidence can also be linked to online banking fraud, online stock trading fraud, source code theft, credit card fraud, tax evasion, virus attacks, cyber sabotage, phishing attacks, email hijacking, denial of service, hacking, murder cases, organized crime, operations terrorists, insults, pornography, extortion, smuggling.
According to the U.S. Department of Justice [17] [18] there are three things in obtaining digital evidence: a). Measures taken in securing and collecting digital evidence p-ISSN: 2087-1627, e-ISSN: 2685-9858 must not affect the integrity of the data; b). In testing digital evidence must be someone who has been trained; c). All activities related to the retrieval, testing, storage or transfer of digital evidence must be documented and subject to retesting.
According to J. Richter [19], there are five characteristics of digital evidence, namely: Admissible (Decent and acceptable), Authentic (Original), Complete (Complete), Reliable (Available) and Believable (Trusted). The explanation for each characteristic of digital evidence is as follows: a). Admissible, Digital evidence must be in accordance with the facts and problems that occur. The evidence submitted must also be accepted and used for the sake of law, starting from research interests to court x; b). Authentic (Original), Evidence must have a clear legal relationship with the case being investigated and evidence is not the result of engineering. Digital evidence must be proven in court that the evidence is original and has never been altered; c). Complete, The evidence must be complete and be able to prove the evil acts committed by the perpetrators of the crime. The evidence collected is not enough just based on one perspective of an event that took place. The information that has been collected, for example, is in the form of a trace entering a system, so the data collected is not only the log data of the perpetrator of the crime, but all the logs that enter the system, because it could actually be that before the perpetrator committed a crime, there were other people who helped him and/or there who committed the crime before the first perpetrator; d). Reliable The evidence collected must be available and reliable. This evidence must be able to be used as evidence in court and the process of collecting evidence and analysis carried out must be in accordance with procedures; e). Believable (Trusted) Evidence and presentations made in court must be understandable by judges and trustworthy. It is useless to present evidence in court such as about binary if the judge does not understand it. The process of submitting evidence in court must use lay language that can be understood by judges.
Several studies regarding the handling of digital evidence, research [20] implements the concept of inventory data, namely the concept of management of physical evidence through control of physical evidence and all activities related to physical evidence can be maintained and can be well documented. This research focuses on data storage that still uses a DBMS so that the physical evidence stored is still visible. Further development is focused on the security of stored data using either encryption or other methods. An integrated system is needed between physical evidence and digital evidence storage systems. Subsequent research on digital evidence, namely research on the application of multi smart contracts, explains that digital evidence has different characteristics and different information details between one type of digital evidence, images, audio, video, and documents or other types of digital evidence. The addition of more detailed information on digital evidence with Multi Smart Contracts can improve the integrity and accuracy of digital evidence and can help simplify and speed up investigators or experts in determining actions in examining managed digital evidence. This research [21] builds middleware by connecting naive chains to multi smart contracts using an API in the form of URLs that are accessed using the curl function on multi smart contracts.
Further research, regarding the flexibility of a repository based on SAN (Storage Area Network) and webbased technology is used as a network-based centralized storage architecture. This system is expected to help law enforcement agencies in terms of traceability management for digital evidence. Subsequent research on digital evidence, namely research that results in developing a proposed digital evidence management system can improve the accessibility of digital evidence which is very influential in the investigation process. Starting from the stage of submitting digital evidence, downloading digital evidence of storage to downloading the chain of custody form, it is done online without having to come to the digital forensic laboratory so that the investigation process runs faster and minimizes the risk of damage to digital evidence and the safety of officers. This research also focuses on dealing with COVID-19 because it can be accessed online by officers. Although there are many studies related to the handling of digital evidence, there is currently no specific research on the development of applications for digital evidence maintenance for cloning and hashing processes. Based on these reasons, this research was carried out with the aim of developing applications for digital evidence maintenance, especially the cloning and hashing processes, because generally investigators only use pre-existing applications.

Method 2.1 Methodology
Digital forensics research methods in this study adopted digital forensics methods from the National Institute of Standards Technology (NIST). Digital forensics method is used to describe the stages of forensics that will be carried out, can find out the flow of research in a structured manner and can be a reference in solving existing problems. The digital forensic steps can be seen in Figure 2. Based on Figure 2, the digital forensic process defined by the National Institute of Standards and Technology (NIST) consists of the following phases: 1) Collection, the purpose of which is to identify potential sources of data relevant to events and then label and record them. After that, the data located at the source must be obtained while maintaining the integrity of the source. 2) Examination, which involves assessing the data obtained from the Data collection and extracting the Phase relevant to the incident while maintaining its integrity. In this inspection phase, there are two processes, namely cloning and hashing. This cloning is to duplicate the digital evidence according to the original, then the hashing process itself to validate the digital evidence and whether it really matches the original or not. 3) Analysis, which involves studying the information extracted by examination to answer the 5WH question or determine that no or partial conclusions can be drawn. 4) Reporting, namely the process of preparing and presenting procedures, methods, and tools used in the investigation along with the results of the analysis phase.

Case Scenario
This research requires a scenario to get validation about the cloning-hashing application by comparing existing applications. Applications used such as FTK Imager, Autopsy, md5sum, and developed applications. The scenario stages include collecting data from different file types, then the data is duplicated which is then carried out by hashing, the application used for testing up to CPU usage or resources on the device used. Basically in digital forensics artifacts from the use of applications can also potentially damage evidence.  The test scenario is shown in Figure 3. An explanation of the files tested with several applications used in this study is shown in table 1. Applications are tested and compared with the results of cloning and hashing along with resource usage.

Result and Discussion
The result of this research is to compare analysis performance of digital forensics applications in cloning, hashing, and resource performance in maintaining digital evidence using the National Institute of Standards Technology (NIST) framework.

Collection Phase
At the stage of collecting the target files are files with different extensions and different capacities as well. There is a capacity of 18 GB which may contain potential evidence. In this file collection stage later all files will be tested from the cloning results and hash values, scenarios including through cloning and hashing results from Autopsy, FTK Imager, and Clon-Hash v1 Applications. After collecting the next step is to maintain and examine the digital evidence. 8 files with different capacities and extensions are also shown in figure 4.

Examination Phase
The Testing Phase is the stage where digital evidence is tested from the evidence that has been collected in the original file in accordance with what was obtained at the crime scene (TKP). This stage is a validation process by securing digital evidence by cloning the original evidence and comparing the cloned results with the hash value because for the integrity of the file or file it must be authentic according to the original. The application developed to maintain digital evidence is shown in figure 5.  Figure 5 shows the results of cloning and hashing of the 7-ntfs-undel.dd file, the source file from the "journal" folder to the destination "proof" folder with the hash value of MD5 the same, SHA256 and SHA1 showing the same results. MD5 hash value "e7dbb96759d9cd62b729463ebfe61dab". The next series tests the performance of the resource when running the Clone-Hash v1 application, which only recently running there is no cloning and hash process because the cloning and hash process should not leave artifacts that can eliminate potential evidence. This is shown in Figure 6.  Figure 6 shows that when the application is run it is still low and small. This proves that the application is not too burdensome for the performance of device resources. Seen only 1.5MB for memory and CPU 0%. The next process compares with other applications such as FTK Imager.  Figure 7 shows FTK Imager during the cloning and hashing process, the hash process on FTK Imager itself must export the hash value and requires another application to open the result of the hash value. The hash value after being exported and opened with another application, the MD5 hash value of the 7-ntfs-undel.dd file, is "e7dbb96759d9cd62b729463ebfe61dab". Figure 8 shows that when the FTK Imager application used memory capacity is 55.5 MB, and the CPU usage is 0%. This proves that the application takes up quite a lot of memory resources, this does not rule out the possibility of having an impact on digital evidence that has the potential to become primary evidence. The next process is to run the Autopsy Application. Autopsy application uses 1,179.4 MB memory capacity, and 0% CPU usage. This proves that the application takes up quite a lot of memory resources, even up to 1GB, this does not rule out the possibility of having an impact on digital evidence that has the potential to become primary evidence. p-ISSN: 2087-1627, e-ISSN: 2685-9858 Figure 9. the Results of hash values md5 application Figure 9 shows that the Autopsy application cannot read the hash value of the 7-ntfs-undel.dd file, so this application is not good for securing digital evidence. md5.exe application that runs on a windows terminal. This md5 application is classified as very little in taking resources including memory and CPU because it runs on the operating system terminal (cmd). This proves that the application is not too burdensome for the performance of device resources. The MD5 hash value of the 7-ntfs-undel.dd file is "e7dbb96759d9cd62b729463ebfe61dab". Figure 9 shows that the Autopsy application cannot read the hash value of the 7-ntfs-undel.dd file, so this application is not good for securing digital evidence. md5.exe application that runs on a windows terminal. This md5 application is classified as very little in taking Table 2

Conclusion
This study concludes that the application developed under the name Clon-Hash v1 is successful. When compared with Autopsy, FTK Imager, this application uses less CPU resources. The Clon-Hash v1 application is better in terms of user interface because it makes it easier for users to operate when compared to md5.exe. The cloning test was successfully proven with the exact same hash value between the original file and the cloned file. Hash values also vary from MD5, SHA1, and SHA256 which do not yet exist in other applications, because the digital forensics standard is the MD5 hash. Application development research for the cloning process and further development has been carried out, with file type extensions, capacities, encryption passwords, and adding application comparators such as forensic toolkits, belkasoft, oxygen and others.