February 21, 2016 By wbadmin 2 Comments

Splitting a Gmail mbox File by Label

It is often necessary to back up your Gmail mail and import the resulting mbox file into your favorite email client. I use Mozilla Thunderbird, a powerful yet free email client. However, the mbox file that Gmail exports is flat: all email from the account is put into one mbox file. Writing a Thunderbird extension was quickly becoming too tedious. (The developer documentation is the worst documentation I have seen! But that is a rant for another time.) So I decided to write a simple Python script to split the mbox file into separate files for each label. It is not perfect: an email in Gmail can have multiple labels, but it will only end up in one mbox file.
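Before splitting anything, it can help to see which labels an export actually contains. The following is a minimal sketch, assuming the export carries the X-Gmail-Labels header as a comma-separated list (the same header the script reads). The function name count_labels and the "(no label)" bucket are made up for illustration.

```python
import mailbox
from collections import Counter


def count_labels(mbox_path):
    """Return a Counter mapping each Gmail label to its message count."""
    counts = Counter()
    # create=False: fail loudly instead of creating an empty file on a typo.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            raw = message["X-Gmail-Labels"]  # None when the header is absent
            if not raw:
                counts["(no label)"] += 1
                continue
            # Assumption: labels are separated by commas.
            for label in str(raw).split(","):
                label = label.strip()
                if label:
                    counts[label] += 1
    finally:
        box.close()
    return counts
```

Calling count_labels("MikeBackupInbox.mbox").most_common(10) would then list the ten most frequent labels, which gives a rough preview of the files a split will produce.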

Without further ado, here is the script:

import sys
import getopt
import mailbox

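def main(argv):
	# Defaults, used when -i / -p are not given. Note that the default input
	# name equals prefix + "Inbox.mbox", one of the output files, so pass -i
	# with a different file name to avoid appending to the file being read.
	in_mbox = "MikeBackupInbox.mbox"
	prefix = "MikeBackup"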
	try:
		opts, args = getopt.getopt(argv, "i:p:", ["infile=", "prefix="])
	except getopt.GetoptError:
		print("python splitgmail.py -i <infile> -p <prefix>")
		sys.exit(2)

	for opt, arg in opts:
		if opt in ("-i", "--infile"):
			in_mbox = arg
		elif opt in ("-p", "--prefix"):
			prefix = arg
	print("Processing file - " +in_mbox+" with prefix = "+prefix)
	# The third argument (create=True) creates each output file if it is missing.
	boxes = {
		"inbox": mailbox.mbox(prefix+"Inbox.mbox", None, True),
		"sent": mailbox.mbox(prefix+"Sent.mbox", None, True),
		"archive": mailbox.mbox(prefix+"Archive.mbox", None, True),
	}

	for message in mailbox.mbox(in_mbox):
		gmail_labels = message["X-Gmail-Labels"]       # Could possibly be None.
		if not gmail_labels:
			boxes["archive"].add(message)
			continue
		gmail_labels = gmail_labels.lower()
		if "spam" in gmail_labels:
			continue
		elif "inbox" in gmail_labels:
			boxes["inbox"].add(message)
		elif "sent" in gmail_labels:
			boxes["sent"].add(message)
		else:
			saved = False
			for label in gmail_labels.split(','):
				if label != "important" and label != "unread" and label != "starred":
					box_name = prefix+label.title()+".mbox"
					if box_name not in boxes:
						boxes[box_name] = mailbox.mbox(box_name, None, True)
					boxes[box_name].add(message)
					saved = True
					break
			if not saved:
				boxes["archive"].add(message)

	# Flush pending writes and release the output files.
	for box in boxes.values():
		box.close()

if __name__ == "__main__":
    main(sys.argv[1:])
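The limitation mentioned above (a message with several labels lands in only one file) could be lifted by copying each message into every label's mbox. This is only a sketch of that variant, not a replacement for the script. The names parse_labels, safe_name, split_into_all_labels and the out_dir parameter are hypothetical, and the file-name rule is an assumption meant to keep characters such as "/" in a label out of the path.

```python
import mailbox
import os
import re

# Status-style labels that the script above also refuses to use as folders.
STATUS_LABELS = {"important", "unread", "starred"}


def parse_labels(raw_labels):
    """Split an X-Gmail-Labels value into a list of stripped labels."""
    if not raw_labels:
        return []
    return [part.strip() for part in str(raw_labels).split(",") if part.strip()]


def safe_name(label):
    """Turn a label into a file-name-safe token (hypothetical naming rule)."""
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", label.title()).strip("_")
    return cleaned or "Unnamed"


def split_into_all_labels(in_path, out_dir, prefix):
    """Copy each message into one mbox per label; return {name: count}."""
    os.makedirs(out_dir, exist_ok=True)
    boxes, counts = {}, {}
    source = mailbox.mbox(in_path, create=False)
    try:
        for message in source:
            labels = parse_labels(message["X-Gmail-Labels"])
            if any(label.lower() == "spam" for label in labels):
                continue  # like the original script, drop spam entirely
            names = []
            for label in labels:
                name = safe_name(label)
                if label.lower() not in STATUS_LABELS and name not in names:
                    names.append(name)
            if not names:
                names = ["Archive"]  # unlabeled or status-only mail
            for name in names:
                if name not in boxes:
                    path = os.path.join(out_dir, prefix + name + ".mbox")
                    boxes[name] = mailbox.mbox(path)
                boxes[name].add(message)
                counts[name] = counts.get(name, 0) + 1
    finally:
        source.close()
        for box in boxes.values():
            box.close()  # close() flushes pending writes first
    return counts
```

The trade-off is disk space: a message with three labels is stored three times. Writing to a separate out_dir also keeps the output files away from the input file.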

Filed Under: Uncategorized

Comments

  1. Julietaigo says

    January 14, 2017 at 12:52 am

    Good post! I read your blog often and you always post excellent content. I posted this article on Facebook and my followers like it. Thanks for writing this!
  2. LuisaLix says

    January 15, 2017 at 7:17 am

    Whoa, this blog is great. I love reading your posts. Keep up the great work! A lot of people are looking for this information, and you can help them greatly.