[JIRA] Delete projects from XML backup using Python

[Experimental]

If you are a JIRA system administrator, have you ever needed to export only part of a JIRA instance to XML, for example to import individual projects into another JIRA instance?

If so, this might work for you. However, my work is incomplete and experimental: it worked with a small JIRA instance (100 issues) and failed with a huge one (1 million issues).

Here’s the Python script I wrote for deleting projects from a JIRA XML backup:

#!/usr/bin/python

from lxml import etree

#xp = etree.XMLParser(encoding='utf-8', recover=True)
#doc = etree.parse('entities.xml', xp)
doc = etree.parse ('entities.xml')

# Project keys to keep; every other project is deleted
plist = ['ABC', 'DEF', 'GHI']

for p in doc.xpath ("//Project"):
    if p.get("key") not in plist:
        #p.get entities and delete
        pi = p.get("id")
        for i in doc.xpath ("//Issue[@project=\'" + pi + "\']"):
            ii = i.get("id")
            for cfv in doc.xpath ("//CustomFieldValue[@issue=\'" + ii + "\']"):
                cfv.getparent().remove(cfv)
            for cg in doc.xpath ("//ChangeGroup[@issue=\'" + ii + "\']"):
                cgi = cg.get("id")
                for ci in doc.xpath ("//ChangeItem[@group=\'" + cgi + "\']"):
                    ci.getparent().remove(ci)
                cg.getparent().remove(cg)
            for fa in doc.xpath ("//FileAttachment[@issue=\'" + ii + "\']"):
                fa.getparent().remove(fa)
            for il in doc.xpath ("//IssueLink[@source=\'" + ii + "\']"):
                il.getparent().remove(il)
            for il in doc.xpath ("//IssueLink[@destination=\'" + ii + "\']"):
                il.getparent().remove(il)
            for na in doc.xpath ("//NodeAssociation[@sourceNodeEntity=\'Issue\' and @sourceNodeId=\'" + ii + "\']"):
                na.getparent().remove(na)
            for wl in doc.xpath ("//Worklog[@issue=\'" + ii + "\']"):
                wl.getparent().remove(wl)
            for c in doc.xpath ("//Action[@issue=\'" + ii + "\']"):
                c.getparent().remove(c)
            i.getparent().remove(i)
        for na in doc.xpath ("//NodeAssociation[@sourceNodeEntity=\'Project\' and @sourceNodeId=\'" + pi + "\']"):
            na.getparent().remove(na)
        for c in doc.xpath ("//Component[@project=\'" + pi + "\']"):
            c.getparent().remove(c)
        for v in doc.xpath ("//Version[@project=\'" + pi + "\']"):
            v.getparent().remove(v)
        for a in doc.xpath ("//ProjectRoleActor[@pid=\'" + pi + "\']"):
            a.getparent().remove(a)
        for pk in doc.xpath ("//ProjectKey[@projectId=\'" + pi + "\']"):
            pk.getparent().remove(pk)
        p.getparent().remove(p)

# Let lxml serialize straight to the file so bytes/str handling is not an issue
doc.write ('entities.new.xml', pretty_print=True, xml_declaration=True, encoding='UTF-8')

Before you run this, unzip the JIRA XML backup, which is a zip archive containing two or three XML files. Put the script in the directory that holds the unzipped XML files and run it. It writes its output to ‘entities.new.xml’. Once it completes, replace ‘entities.xml’ with ‘entities.new.xml’, remove the Python script, and re-pack the folder into a new zip file. The archive is then ready to be imported into another JIRA instance.
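The unzip and re-pack steps can also be scripted. The following is a minimal sketch using Python's standard `zipfile` module; the function name `repack_backup` and the archive file names in the usage comment are hypothetical, and it assumes the archive stores `entities.xml` at its top level. It copies every member of the original backup into a new archive and substitutes the filtered file for `entities.xml`.

```python
import os
import zipfile


def repack_backup(src_zip, new_entities, dst_zip):
    """Copy src_zip to dst_zip, swapping in new_entities as entities.xml."""
    with zipfile.ZipFile(src_zip) as src, \
         zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            if name == "entities.xml":
                # Use the filtered file in place of the original
                dst.write(new_entities, arcname="entities.xml")
            else:
                # Copy every other member of the backup unchanged
                dst.writestr(name, src.read(name))


# Hypothetical usage (file names are placeholders):
# repack_backup("backup.zip", "entities.new.xml", "backup.filtered.zip")
```

Note that `src.read(name)` loads each member fully into memory, which is fine for small backups but may need a streamed copy for very large ones.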

Please note that this is incomplete and I cannot guarantee success. You can follow the structure of this code and improve it yourself.
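One possible direction for improvement: every `//` XPath query in the script rescans the whole document, so the work grows quickly with the number of issues. Whether that is what caused the failure on the large instance is not established here. The sketch below collects the ids to drop into sets and then filters the document in a single pass. It uses the standard library's `xml.etree.ElementTree` rather than lxml, and it assumes that every entity is a direct child of the root element and that the attribute names match those used in the script above. The names `filter_projects`, `ISSUE_REFS` and `PROJECT_REFS` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Entity tag -> attributes that hold an issue id (taken from the script above)
ISSUE_REFS = {
    "CustomFieldValue": ["issue"], "ChangeGroup": ["issue"],
    "FileAttachment": ["issue"], "IssueLink": ["source", "destination"],
    "Worklog": ["issue"], "Action": ["issue"],
}
# Entity tag -> attributes that hold a project id
PROJECT_REFS = {
    "Component": ["project"], "Version": ["project"],
    "ProjectRoleActor": ["pid"], "ProjectKey": ["projectId"],
}


def filter_projects(root, keep_keys):
    """Remove, in place, every project not in keep_keys and its dependents."""
    children = list(root)

    # Collect the ids to drop: projects, then their issues, then change groups
    dead_projects = {e.get("id") for e in children
                     if e.tag == "Project" and e.get("key") not in keep_keys}
    dead_issues = {e.get("id") for e in children
                   if e.tag == "Issue" and e.get("project") in dead_projects}
    dead_groups = {e.get("id") for e in children
                   if e.tag == "ChangeGroup" and e.get("issue") in dead_issues}
    for ids in (dead_projects, dead_issues, dead_groups):
        ids.discard(None)  # a missing id attribute must never match

    def is_dead(e):
        if e.tag == "Project":
            return e.get("id") in dead_projects
        if e.tag == "Issue":
            return e.get("id") in dead_issues
        if e.tag == "ChangeItem":
            return e.get("group") in dead_groups
        if e.tag == "NodeAssociation":
            kind, node = e.get("sourceNodeEntity"), e.get("sourceNodeId")
            return ((kind == "Issue" and node in dead_issues) or
                    (kind == "Project" and node in dead_projects))
        if any(e.get(a) in dead_issues for a in ISSUE_REFS.get(e.tag, [])):
            return True
        return any(e.get(a) in dead_projects
                   for a in PROJECT_REFS.get(e.tag, []))

    # Single pass: keep only the elements that are not marked for removal
    root[:] = [e for e in children if not is_dead(e)]


# Hypothetical usage:
# tree = ET.parse("entities.xml")
# filter_projects(tree.getroot(), {"ABC", "DEF", "GHI"})
# tree.write("entities.new.xml", encoding="UTF-8", xml_declaration=True)
```

This still loads the whole file into memory, so it only addresses the repeated scans, not the memory footprint of a very large backup.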