ReClass

Author	SHA1	Message	Date
charlie-rasberry	01e2142276	Fixed a few issues with performance data collection and debugging output. MTL training is ready; moving on to single-task training for comparison in the write-up	2026-02-26 17:40:37 +00:00
charlie-rasberry	df6aec7165	ignore runs	2026-02-23 16:29:44 +00:00
charlie-rasberry	9467ea2519	ignore tensorboard logs and model checkpoints	2026-02-23 16:28:43 +00:00
charlie-rasberry	4f0c54fe28	Added training loop for the MTL architecture on the original distribution	2026-02-23 16:26:48 +00:00
charlie-rasberry	7bd68108d0	Implemented initial training structure, adding further logic soon including loss, stopping, optimisation and loop	2026-02-23 12:54:23 +00:00
charlie-rasberry	76d9b8509b	Model almost complete, need to work on loss functions soon	2026-02-20 19:17:22 +00:00
charlie-rasberry	cccd91a680	Small bit of progress towards model.py, now building forward()	2026-02-20 18:18:17 +00:00
charlie-rasberry	61df4e3e26	Implemented dataset.py which tokenises and returns tensors, ready to load the model now	2026-02-19 22:10:25 +00:00
charlie-rasberry	19c0d4bce3	Started dataset.py, added the ReviewDataset class and implemented the __init__, __len__ and __getitem__ methods. The __getitem__ method currently just returns the review text, but will be updated to return the tokenized review as a tensor	2026-02-19 18:45:55 +00:00
charlie-rasberry	19bcf2aa18	Started dataset.py, added the ReviewDataset class and implemented the __init__, __len__ and __getitem__ methods. The __getitem__ method currently just returns the review text, but will be updated to return the tokenized review as a tensor	2026-02-19 18:41:37 +00:00
charlie-rasberry	c5e91b79b2	Decided on max_length by finding out how many and which reviews would be truncated (it will be 256 tokens)	2026-02-19 01:28:10 +00:00
charlie-rasberry	0be7da2dde	Finally processed the data fully and tested. Moving on to dataset.py and model.py	2026-02-19 00:44:36 +00:00
charlie-rasberry	608588f023	Preprocessed tagged datasets, fixed CSV formatting issues, and added integrity checks. Also saved mappings for later inference use.	2026-02-18 22:36:58 +00:00
charlie-rasberry	94a9fa1f17	gitignore change	2026-02-16 16:51:32 +00:00
Charlie Rasberry	8dbc5e7fc1	Remove duplicate repository structure heading: removed redundant repository structure section.	2026-02-16 12:42:16 +00:00
Charlie Rasberry	c006b2fcff	Fix formatting in README for repository structure	2026-02-16 12:41:53 +00:00
charlie-rasberry	b88504725d	Cleaned notebooks, finished data labelling	2026-02-16 12:36:29 +00:00
charlie-rasberry	8d3dee6d30	House Cleaning	2026-01-28 16:41:27 +00:00
charlie-rasberry	6cf36faf64	Line ending issue with my setup	2025-12-19 07:19:02 +00:00
charlie-rasberry	487be5cd27	Everything is good to go for annotations.	2025-12-19 07:14:13 +00:00
charlie-rasberry	5b9fbfc75e	Data processing pipeline now finished; just need to annotate reviews	2025-11-22 09:41:12 +00:00
charlie-rasberry	45ec02fa46	Moving on to multitag.py, sampling complete I think	2025-11-12 06:21:16 +00:00
charlie-rasberry	2cbdd55243	Fixed get_stratified_sample() and replaced broken x() with working logic; added sample_with_keywords().	2025-11-12 02:05:20 +00:00
charlie-rasberry	a178284ffc	Added multitag.py (65% complete), preprocess.py (complete), sampler.py (80% complete)	2025-11-09 01:45:09 +00:00
charlie-rasberry	4d6e2511e6	Added multitag, which includes preprocess.py, sampler.py and multitag.py (the main GUI for labelling/annotation)	2025-11-06 17:40:29 +00:00
charlie-rasberry	c0d4c13824	Ignore large CSV data files	2025-11-06 17:39:26 +00:00
charlie-rasberry	cf6f2d8371	uber review	2025-10-08 00:19:49 +01:00
charlie-rasberry	3c51a51331	initial commit	2025-10-07 23:47:48 +01:00

28 Commits