Hive

From UCT EE Wiki
Revision as of 14:25, 9 March 2021 by Swinberg


Overview

The Hive server is a single access point for many EE resources.

Online Resources

Hive hosts a shared SAMBA folder for communal software that may be useful across courses. The share path is \\hive.ee.uct.ac.za\public on Windows and //hive.ee.uct.ac.za/public on Linux-based systems.

Prerequisites

You need to be on the UCT network (or connected via the VPN) and, if your system requires it, have the SAMBA protocol enabled. For details on SAMBA, see Network Protocols.
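A quick way to confirm the network prerequisite is to test whether the server accepts a TCP connection. The following is a minimal sketch, not an official tool: the function name can_reach is hypothetical, and it assumes the share is served on the standard SMB port 445, which this page does not state.

```python
import socket

# Assumption: the share is served over SMB on the standard TCP port 445.
SMB_PORT = 445


def can_reach(host, port=SMB_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and failed name lookups.
        return False


# Example (only meaningful on the UCT network or VPN):
# print(can_reach("hive.ee.uct.ac.za"))
```

A False result only shows that the port could not be reached from the current machine. It does not distinguish between being off the UCT network, a firewall, or the server being down.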

Notes for Uploading

Public files should be placed in /home/public. Files in public should:

  • Have group (and possibly also owner) sambashare.
  • Have a README.txt that describes the contents and lists the MD5 checksums.
  • Ideally be no larger than 700 MB (you can use WinRAR to split files), so that they still fit on CDs if required.

Members of the sambashare group can copy files into /home/public without using sudo. Only trusted people, who won't mess up other users' files in public, should be added to this group.
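The file guidelines above can be checked before uploading. The sketch below is illustrative only: the function name check_upload is hypothetical, and it assumes the 700 MB limit means 700 MiB (roughly the capacity of a CD). It checks for README.txt and for oversized files, but does not check group ownership.

```python
import os

# Assumption: the size limit is read as 700 MiB, roughly one CD.
MAX_BYTES = 700 * 1024 * 1024


def check_upload(directory, max_bytes=MAX_BYTES):
    """Return a list of human-readable problems found in directory."""
    problems = []
    # The guidelines ask for a README.txt alongside the files.
    if not os.path.isfile(os.path.join(directory, "README.txt")):
        problems.append("README.txt is missing")
    # Report every regular file that exceeds the size limit.
    with os.scandir(directory) as it:
        for entry in sorted(it, key=lambda e: e.name):
            if entry.is_file() and entry.stat().st_size > max_bytes:
                problems.append(
                    "{} is larger than {} bytes".format(entry.name, max_bytes)
                )
    return problems


if __name__ == "__main__":
    for problem in check_upload("."):
        print(problem)
```

An empty list means the directory passed both checks. Run it from the directory you intend to copy into public.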

MD5 Checksums

MD5 checksums can be provided so that users can confirm that large files downloaded correctly. Dedicated programs can verify checksums, but the simplest option is probably the following Python script. The function get_checksums prints a list in the format the script expects for its "expected" list. The function verify_checksums iterates over the "expected" list, looks for each file, and verifies its checksum. This code and its documentation are available on the EE GitHub.

The code is as follows:

"""
hashcheck.py

Generates and verifies MD5 hashes for files in the current working directory, so run it from the directory that holds the files.

Keegan Crankshaw
April 2020
"""
import hashlib
import os

# Place your "expected" list here
# The method "get_checksums" will produce this list for all files in the current directory
expected = [
["example.exe", "<example.exe md5>"],
["file2.rar", "<file2.rar md5>"],
]

def md5(fname):
    """
    Return the MD5 hex digest of a file. The file is read in 4096-byte chunks so that large files are not loaded into memory at once.
    Taken from https://stackoverflow.com/a/3431838
    """
    hash_md5 = hashlib.md5()
    with open(fname, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()
    
def get_checksums():
    """
    Get the checksum for all files in the current directory
    """
    print("expected = [")
    with os.scandir(".") as it:
        for entry in it:
            if entry.is_file():
                print("[\"{}\", \"{}\"],".format(entry.name, md5(entry.name)))
    print("]")
                
def verify_checksums():
    """
    Scan the current directory for files in the "expected" list and see if the hashes match
    """
    for f in expected:
        if not os.path.exists(f[0]):
            print("{} - File does not exist.".format(f[0]))
        elif md5(f[0]) == f[1]:
            print("{} - Correct hash".format(f[0]))
        else:
            print("{} - Badly downloaded".format(f[0]))

if __name__ == "__main__":
    # Adjust the line below to call the method you need
    get_checksums()
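
The chunked read used by md5 produces the same digest as hashing the whole file in one call, which is why it is safe for large files. The following minimal sketch demonstrates this on a temporary file. The names md5_chunked and demo are hypothetical and are not part of hashcheck.py.

```python
import hashlib
import os
import tempfile


def md5_chunked(fname, chunk_size=4096):
    """Hash a file in fixed-size chunks, mirroring md5 in hashcheck.py."""
    hash_md5 = hashlib.md5()
    with open(fname, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()


def demo():
    """Write a temporary file and compare chunked and one-shot digests."""
    data = os.urandom(10000)  # spans several 4096-byte chunks
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        return md5_chunked(path) == hashlib.md5(data).hexdigest()
    finally:
        os.remove(path)


if __name__ == "__main__":
    print("Chunked digest matches one-shot digest:", demo())
```

MD5 is adequate here for detecting corrupted downloads, but it is not collision resistant and should not be relied on to detect deliberate tampering.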