In this tutorial, we will show how to launch ARCIMBOLDO_BORGES in order to:
1. Create a library
2. Use that library against experimental diffraction data to solve a structure, and analyze the resulting output
This example uses the latest ARCIMBOLDO_BORGES version, released in November 2016.
Secondary structure prediction is useful for defining an initial hypothesis about the local folds present in our structure. For example, the prediction may suggest the presence of beta sheets with at least three strands, even if we do not know whether they are parallel, antiparallel, or mixed. In that case, BORGES can automatically perform an initial assessment with subsets of libraries, either prioritizing the starting hypotheses according to molecular replacement (MR) figures of merit (FOMs) or following the order of the most frequent folds.
In our case, the protein is an immunoglobulin kappa light chain domain, so it must contain antiparallel beta sheets. The secondary structure prediction done with PSIPRED (shown in the figure) agrees, so we will create and use an antiparallel beta-sheet library.
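As a rough illustration of how a three-state prediction string (H = helix, E = strand, C = coil, as PSIPRED reports it) can be turned into a starting hypothesis, the following minimal sketch counts predicted strands. The prediction string, the minimum lengths, and the function name are made up for illustration; this is not the 4l1h prediction and not part of ARCIMBOLDO_BORGES.

```python
import re


def count_segments(ss: str, state: str, min_len: int = 3) -> int:
    """Count runs of `state` that are at least `min_len` residues long."""
    pattern = re.escape(state) + "{" + str(min_len) + ",}"
    return len(re.findall(pattern, ss))


# Made-up prediction string, for illustration only (NOT the 4l1h prediction).
prediction = "CCEEEEEECCCCEEEEECCCHHHHCCEEEEEEECC"

n_strands = count_segments(prediction, "E", min_len=4)
n_helices = count_segments(prediction, "H", min_len=8)

# Three or more predicted strands would point towards a beta-sheet library.
suggest_beta_library = n_strands >= 3
print(n_strands, n_helices, suggest_beta_library)
```

The prediction alone does not say whether the strands are parallel or antiparallel; that choice still comes from prior knowledge of the fold, as in the immunoglobulin case here.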
To solve a structure using ARCIMBOLDO_BORGES we need at least one library. This library contains all superimposed models retrieved from the database that fulfill geometrical conditions defined by the user (e.g. two contiguous parallel alpha helices of 16 amino acids, or three antiparallel beta strands, within set thresholds). The library is created with a novel algorithm in ARCIMBOLDO_BORGES that can extract not only alpha helices and beta strands, but also coils and loops. This new algorithm is still under development and will be described soon.
For this tutorial we need only the reflection file, in two formats (.mtz and .hkl). All required files, including the library, can be downloaded here. If you have an hkl file, you can use the programs F2MTZ and TRUNCATE, or generate a .sca file and use SCALEPACK2MTZ, to obtain the mtz file. Conversely, if you have an mtz file, you can use MTZ2HKL to get your hkl file. For runs of ARCIMBOLDO_BORGES on helical fragments, you will need to provide the mtz file in the P1 space group in order to use Patterson Correlation Refinement of rotations.
[CONNECTION]:
distribute_computing: local_grid
setup_bor_path: /path/to/setup.bor

[GENERAL]:
working_directory: /path/to/working_directory
mtz_path: %(working_directory)s/4l1h.mtz
hkl_path: %(working_directory)s/4l1h.hkl

[ARCIMBOLDO-BORGES]
name_job: 4l1h
molecular_weight: 13000
number_of_component: 1
rmsd: 0.2
i_label: IOBS
sigi_label: SIGIOBS
shelxe_line: -m30 -s0.6 -v0 -a6 -t10 -o
library_path: /absolute/path/to/the/library/
prioritize_phasers: True
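The `%(working_directory)s` notation in the configuration is the interpolation syntax of Python's `configparser` module. Assuming the .bor file can be read that way (an assumption, not something this tutorial states), a short sketch like the one below can check the values before launching. The section headers are written here as plain `[NAME]`, the paths are the tutorial's placeholders, and the `REQUIRED` list is only our reading of the keys this tutorial says must be defined.

```python
import configparser

# Illustrative copy of the configuration, with plain "[NAME]" section headers.
BOR_TEXT = """
[CONNECTION]
distribute_computing: local_grid
setup_bor_path: /path/to/setup.bor

[GENERAL]
working_directory: /path/to/working_directory
mtz_path: %(working_directory)s/4l1h.mtz
hkl_path: %(working_directory)s/4l1h.hkl

[ARCIMBOLDO-BORGES]
name_job: 4l1h
molecular_weight: 13000
number_of_component: 1
rmsd: 0.2
i_label: IOBS
sigi_label: SIGIOBS
shelxe_line: -m30 -s0.6 -v0 -a6 -t10 -o
library_path: /absolute/path/to/the/library/
prioritize_phasers: True
"""

config = configparser.ConfigParser()
config.read_string(BOR_TEXT)

# Interpolation expands %(working_directory)s within the [GENERAL] section.
general = config["GENERAL"]
mtz_path = general["mtz_path"]
hkl_path = general["hkl_path"]

borges = config["ARCIMBOLDO-BORGES"]
molecular_weight = borges.getfloat("molecular_weight")
prioritize = borges.getboolean("prioritize_phasers")

# Keys the tutorial text says must be defined (our reading, not an official list).
REQUIRED = ["molecular_weight", "number_of_component", "shelxe_line", "library_path"]
missing = [key for key in REQUIRED if key not in borges]

print(mtz_path, hkl_path, molecular_weight, prioritize, missing)
```

With real paths in place, the same script could also check that the mtz and hkl files exist, for example with `os.path.isfile`.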
The [CONNECTION] section contains the information about the type of run and about the general configuration instructions. In this case, we are going to use the local grid defined in the configuration file setup.bor.
The molecular weight, the number of components, and the percentage of identity should be defined so that PHASER can perform our search.
As clusters in the library will be further grouped after the results of an initial fast rotation function (FRF), we can choose if we want to try all or select a subset (defined by a list of numbers separated by commas).
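To make the "all or a subset" choice concrete, here is a minimal sketch of how such a value could be interpreted. The literal `all`, the zero-based numbering, and the function name are assumptions made for illustration; the actual keyword and numbering used by ARCIMBOLDO_BORGES are not given in this tutorial.

```python
from typing import List


def parse_cluster_selection(value: str, n_clusters: int) -> List[int]:
    """Return cluster indices for 'all' or for a comma-separated list of numbers."""
    text = value.strip()
    if text.lower() == "all":  # hypothetical keyword for "try every cluster"
        return list(range(n_clusters))
    selected: List[int] = []
    for token in text.split(","):
        index = int(token.strip())
        if not 0 <= index < n_clusters:
            raise ValueError(f"cluster {index} is outside 0..{n_clusters - 1}")
        if index not in selected:  # ignore repeated numbers, keep the given order
            selected.append(index)
    return selected


print(parse_cluster_selection("all", 4))
print(parse_cluster_selection("2, 0,2", 4))
```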
Arguments for the SHELXE command line must be given.
In library_path, you have to give the absolute path to the directory where the library of beta sheets is located.
To launch ARCIMBOLDO_BORGES you can again choose between:
2. In background:
nohup ARCIMBOLDO_BORGES conf_file.bor >& logfile.log&
In the working directory, an html file named after the job is created. It is written while the program is running and updated as results are obtained. You may use this information for manual intervention (stopping the run, reparameterizing). If left to run, ARCIMBOLDO_BORGES will sequentially try the best clusters (the green ones in the html, chosen according to number of models and figures of merit). Once a solution is found (SHELXE traced main-chain CC > 30%), it stops after some recycling steps to improve it.
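The stopping rule described above can be sketched as a simple loop over clusters in priority order. The cluster names and CC values below are invented for illustration, and the function is not part of ARCIMBOLDO_BORGES; only the 30% threshold comes from the text.

```python
from typing import Iterable, Optional, Tuple

CC_THRESHOLD = 30.0  # percent; the solution criterion quoted in this tutorial


def first_solved(results: Iterable[Tuple[str, float]],
                 threshold: float = CC_THRESHOLD) -> Optional[str]:
    """Return the first cluster whose traced main-chain CC exceeds the threshold."""
    for cluster, cc in results:
        if cc > threshold:
            return cluster
    return None


# Hypothetical (cluster, CC %) pairs in the order the clusters would be tried.
trial = [("cluster_a", 12.4), ("cluster_b", 18.9), ("cluster_c", 34.2)]
solved = first_solved(trial)
print(solved)
```

The comparison is strict, so a CC of exactly 30% would not count as solved in this sketch.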
What we have in our 4l1h output